Long pre-mRNA depletion and RNA missplicing contribute to
Transcription
Long pre-mRNA depletion and RNA missplicing contribute to
articles © 2011 Nature America, Inc. All rights reserved. Long pre-mRNA depletion and RNA missplicing contribute to neuronal vulnerability from loss of TDP-43 Magdalini Polymenidou1,2,6, Clotilde Lagier-Tourenne1,2,6, Kasey R Hutt2,3,6, Stephanie C Huelga2,3, Jacqueline Moran1,2, Tiffany Y Liang2,3, Shuo-Chien Ling1,2, Eveline Sun1,2, Edward Wancewicz4, Curt Mazur4, Holly Kordasiewicz1,2, Yalda Sedaghat4, John Paul Donohue5, Lily Shiue5, C Frank Bennett4, Gene W Yeo2,3 & Don W Cleveland1,2 We used cross-linking and immunoprecipitation coupled with high-throughput sequencing to identify binding sites in 6,304 genes as the brain RNA targets for TDP-43, an RNA binding protein that, when mutated, causes amyotrophic lateral sclerosis. Massively parallel sequencing and splicing-sensitive junction arrays revealed that levels of 601 mRNAs were changed (including Fus (Tls), progranulin and other transcripts encoding neurodegenerative disease–associated proteins) and 965 altered splicing events were detected (including in sortilin, the receptor for progranulin) following depletion of TDP-43 from mouse adult brain with antisense oligonucleotides. RNAs whose levels were most depleted by reduction in TDP-43 were derived from genes with very long introns and that encode proteins involved in synaptic activity. Lastly, we found that TDP-43 autoregulates its synthesis, in part by directly binding and enhancing splicing of an intron in the 3′ untranslated region of its own transcript, thereby triggering nonsense-mediated RNA degradation. Amyotrophic lateral sclerosis (ALS) is an adult-onset disorder in which premature loss of motor neurons leads to fatal paralysis. Most cases of ALS are sporadic, with only 10% of affected individuals having a familial history. A breakthrough in understanding ALS pathogenesis was the discovery that TDP-43, which in the normal setting is primarily nuclear, mislocalizes and forms neuronal and glial cytoplasmic aggregates in ALS1,2, frontotemporal lobar degeneration (FTLD), and in Alzheimer’s and Parkinson’s diseases (reviewed in ref. 3). Dominant mutations in TDP-43 were subsequently identified as being causative in sporadic and familial ALS cases and in rare individuals with FTLD4–7. At present, it is unresolved as to whether neurodegeneration is a result of a loss of TDP-43 function or a gain of toxic property or a combination of the two. However, a notable feature of TDP-43 pathology is TDP43 nuclear clearance in neurons containing cytoplasmic aggregates, consistent with pathogenesis being driven, at least in part, by a loss of TDP-43 nuclear function1,7. Several lines of evidence suggest an involvement of TDP-43 in multiple steps of RNA metabolism, including transcription, splicing or transport of mRNA3, as well as microRNA metabolism8. Misregulation of RNA processing has been described in a growing number of neurological diseases9. The recognition of TDP-43’s role in neurodegeneration and the recent identification of ALS-causing mutations in FUS/TLS10,11, another RNA/DNA binding protein, has reinforced a crucial role for RNA processing regulation in neuronal integrity. However, a comprehensive protein-RNA interaction map for TDP-43 and identification of post-transcriptional events that may be crucial for neuronal survival remain to be established. A common approach for identifying specific RNA-binding protein targets or aberrantly spliced isoforms related to disease has been through selection of candidate genes. However, recent advances in DNA sequencing technology have provided powerful tools for exploring transcriptomes at remarkable resolution12. Moreover, cross-linking, immunoprecipitation and high-throughput sequencing (CLIP-seq or HITS-CLIP) experiments have found that a single RNA-binding protein can have previously unrecognized roles in RNA processing and affect many alternatively spliced transcripts13–15. We used these approaches to identify a comprehensive TDP-43 protein-RNA interaction map in the CNS. After depletion of TDP-43 in vivo, RNA sequencing and splicing-sensitive microarrays were used to determine that TDP-43 is crucial for maintaining normal levels and splicing patterns of >1,000 mRNAs. The most downregulated of these TDP-43–dependent RNAs have pre-mRNAs with very long introns that contain multiple TDP43–binding sites and encoded proteins related to synaptic activity. The nuclear TDP-43 clearance widely reported in TDP-43 proteinopathies1,7 will lead to a disruption of this role on long RNAs that are preferentially expressed in brain, thereby contributing to neuronal vulnerability. 1Ludwig Institute for Cancer Research, University of California at San Diego, La Jolla, California, USA. 2Department of Cellular and Molecular Medicine, University of California at San Diego, La Jolla, California, USA. 3Stem Cell Program and Institute for Genomic Medicine, University of California at San Diego, La Jolla, California, USA. 4Isis Pharmaceuticals, Carlsbad, California, USA. 5RNA Center, Department of Molecular, Cell and Developmental Biology, Sinsheimer Labs, University of California, Santa Cruz, California, USA. 6These authors contributed equally to this work. Correspondence should be addressed to D.W.C. ([email protected]) or G.W.Y. ([email protected]). Received 20 December 2010; accepted 14 February 2011; published online 27 February 2011; doi:10.1038/nn.2779 nature neuroscience advance online publication 1 RESULTS Protein-RNA interaction map of TDP-43 in mouse brain We used CLIP-seq to identify in vivo RNA targets of TDP-43 in adult mouse brain (Supplementary Fig. 1a). After ultraviolet irradiation to stabilize in vivo protein-RNA interactions, we immunoprecipitated TDP-43 with a monoclonal antibody16 that had a higher immunoprecipitation efficiency than any of the commercial antibodies tested (Supplementary Fig. 1b). Complexes representing the expected molecular weight of a single molecule of TDP-43 bound to its target RNAs were excised (Fig. 1a) and sequenced. We also observed lower mobility protein-RNA complexes whose abundance was reduced by increased nuclease digestion. Immunoblotting of the same immunoprecipitated samples before radioactive labeling of the target RNAs revealed that TDP-43 protein was a component of both the ~43-kDa complex and more slowly migrating complexes (Fig. 1a). We performed two independent experiments and obtained 5,341,577 and 12,009,500 36-bp sequence reads. Mapped reads of both experiments were predominantly in protein-coding genes, with ~97% of them oriented in the direction of transcription, confirming little DNA contamination. The positions of mapped reads from both experiments were highly consistent, as exemplified by TDP-43 binding on the semaphorin 3F transcript (Fig. 1b). We used a cluster-finding algorithm with gene-specific thresholds that accounted for pre-mRNA length and variable expression levels15,17 to Figure 1 TDP-43 binds distal introns of premRNA transcripts through UG-rich sites in vivo. (a) Autoradiograph of TDP-43-RNA complexes trimmed by different concentrations of micrococcal nuclease (MNase, left). Complexes in red box were used for library preparation and sequencing. Immunoblot showing TDP-43 in ~46 kDa and higher molecular weight complexes dependent on ultraviolet (UV) crosslinking (right). (b) Example of a TDP-43–binding site (CLIP cluster) on Semaphorin 3F defined by overlapping reads from two independent experiments surpassing a gene-specific threshold. (c) University of California Santa Cruz (UCSC) Genome Browser screenshot of neurexin 3 intron 8 (mm8; chr12:89842000–89847000) displaying three examples of TDP-43–binding modes. The right-most CLIP cluster represents a canonical binding site coinciding GU-rich sequence motifs, whereas the left-most cluster lacks GU-rich sequences and a region containing multiple GU repeats showed no evidence of TDP-43 binding. The second CLIP cluster (middle purple-outlined box) with weak binding was found only when relaxing cluster-finding algorithm parameters. (d) Flow-chart illustrating the number of reads analyzed from both CLIPseq experiments. (e) Histogram of Z scores indicating the enrichment of GU-rich hexamers in CLIP-seq clusters compared with equally sized clusters, randomly distributed in the same premRNAs. Sequences and Z scores of the top eight hexamers are indicated. Pie charts enumerate clusters containing increasing counts of (GU)2 compared with randomly distributed clusters (P ≈ 0, χ2 = 21,662). (f) Pre-mRNAs were divided into annotated regions (top). Distribution of TDP-43 (left) or previously published Argonaute-binding sites21 as a control (right) showed preferential binding of TDP-43 in distal introns. 2 identify TDP-43–binding sites from clusters of sequence reads (Fig. 1b), with a conservative threshold (the number of reads mapped to a cluster had to exceed the expected number by chance at P < 0.01). This stringent definition will miss some true binding sites, but was intentionally chosen to identify the strongest bound sites while minimizing false positives. Indeed, additional probable binding sites could be identified by inspection of reads mapped to specific RNAs (see Fig. 1c). Moreover, similarly defined clusters from the low mobility complexes (Supplementary Fig. 1b) showed a 92% overlap with those from the monomeric complexes (Fig. 1a), consistent with the reduced mobility complexes comprising multiple TDP-43s (or other RNA-binding proteins) bound to a single RNA. Genome-wide comparison of our replicate experiments revealed (Supplementary Fig. 1c) that the vast majority (90%) of TDP-43– binding sites in experiment 1 overlapped with those in experiment 2 (compared with an overlap of only 8% (P ≈ 0, Z = 570) when clusters were randomly distributed across the length of the pre-mRNAs containing them). Combining the mapped sequences yielded 39,961 clusters, representing binding sites of TDP-43 in 6,304 annotated protein-coding genes, approximately 30% of the murine transcriptome (Fig. 1d). We computationally sampled reads (in 10% intervals) from the CLIP sequences and found a clear logarithmic relationship (Supplementary Fig. 1d), from which we calculated that our current dataset contains ~84% of all TDP-43 RNA targets in the mouse brain. Comparison with the mRNA targets identified from primary rat neuronal cells18 by RNA a b MNase UV: + α–TDP-43 + + + MNase α–TDP-43 IgG + + + – + IgG + 1 kb CLIP reads experiment 1 CLIP reads experiment 2 80 58 46 30 CLIP reads combined 25 17 7 CLIP cluster RNA c Protein (62 reads) Sema3F d 2 kb CLIP experiment 1 5,341,577 reads CLIP reads combined CLIP experiment 2 12,009,500 reads Combine reads 17,351,077 reads Trim and map 3,473,413 reads CLIP clusters 97.50% sense UGUGUG Cluster definition e 12 8 TDP-43 clusters Randomly distributed GUGU count clusters GUGU count Motif Z score augugu 158 uguagu 156 guaugu 146 ugugua 143 uaugug 138 guguau 137 9% 9% 10% 10% 57% 15% 12% 0 1 2 Motif Z score ugugug 462 gugugu 453 100 Z score 300 3 f Distal intron 5’ UTR 46% 21% 11% 4 0 Direction of transcription in genes 39,961 clusters in 6,304 genes Nrxn 3 intron 8 log2 (motif count) © 2011 Nature America, Inc. All rights reserved. articles Proximal intron Not annotated 3’ UTR 2 kb Exon 2 kb 2% 1% 3% 1% 30% >3 63% TDP-43 CLIP binding 4% 3% 28% 24% 27% 14% Ago CLIP binding advance online publication nature neuroscience articles intronic clusters being >2 kb from the nearest exon-intron boundary (Fig. 1f). This number rose to 82% for clusters >500 bases from the nearest exon-intron boundary (Supplementary Fig. 2b). Such distal intronic binding is in sharp contrast with published RNA-binding maps for tissue-specific RNA binding proteins involved in alternative splicing, such as Nova or Fox214,15. The same analysis on published data in mouse brain for the Argonaute proteins17,21, which are recruited by microRNAs to the 3′ ends of genes in metazoans17,21, showed a substantially different pattern of binding. Only 24% (or 30%) of Argonaute clusters resided within 2 kb (or 500 bases) of the nearest exon-intron boundary, whereas 28% were in 3′ untranslated regions (3′ UTRs) (Fig. 1f and Supplementary Fig. 2b). This prominent concentration of Argonaute binding near 3′ ends is in stark contrast with the uniform distribution of TDP-43–binding sites across the length of pre-mRNAs (Supplementary Fig. 2c). TDP-43 binds GU-rich distal intronic sites Sequence motifs enriched in TDP-43–binding sites were determined by comparing sequences in clusters with randomly selected regions of similar sizes in the same protein-coding genes. Z score statistics revealed that the most significantly enriched hexamers consisted of GU-repeats (Z > 450), consistent with published in vitro results20, or a GU-rich motif interrupted by a single adenine (Z = 137–158) (Fig. 1e). The majority (57%) of clusters contained at least four GUGU elements, as compared with only 9% RNAs altered after in vivo TDP-43 depletion in mouse brain when equally sized clusters were randomly placed in the same pre-mRNAs To identify the contribution of TDP-43 in maintaining levels and (Fig. 1e). Furthermore, the number of GUGU tetramers correlated with splicing patterns of RNAs, we injected two antisense oligonucleotides the strength of binding, as estimated by the relative number of reads in each (ASOs) directed against TDP-43 and a control ASO with no target in cluster per gene compared with all clusters in other genes (Supplementary the mouse genome into the striatum of normal adult mice (Fig. 2a). Fig. 2a). Nevertheless, genome-wide analysis revealed that GU-rich Striatum is a well-defined structure that is amenable to accurate dissecrepeats were neither necessary nor sufficient to specify a TDP-43–binding tion and isolation, with TDP-43 expression levels being comparable to site. One example is the left-most binding site in neurexin 3 (Fig. 1c), other brain regions. Stereotactic injections of ASOs that target TDP-43, which does not have a GU motif, whereas a GU-rich motif 2-kb upstream control ASO or saline were performed in three groups of age- and sexof it was not bound by TDP-43. In fact, only ~3% of all transcribed matched adult C57BL/6 mice and were tolerated with minimal effects 300 nucleotide stretches containing more than a TDP-43 mRNA three GUGU tetramers contained TDP-43 clusStereotactic injection of ASOs AAAAAAAAAA in the striatum of normal mice ters by CLIP-seq, indicating that TDP-43 target TDP-43–specific ASO RNA-seq genes cannot be identified by simply scanning RNA qRT-PCR extraction nucleotide sequences for GU-rich regions. RT-PCR RNase H Although the vast majority (93%) of TDPAAAAAAAAAA 43 sites were in introns, we identified a bindProtein ing preference of TDP-43 with most (63%) Immunoblot extraction A AA AA Dissection of the striatum A TDP-43 pre-mRNA degradation b c Control oligo TDP-43 KD Control oligo 342,150,462 reads 72 nt long (n = 4) Saline TDP-43 120 100 80 60 40 20 0 120 100 80 60 40 20 0 94.56% sense qRT-PCR e nature neuroscience advance online publication 2 4 6 8 Expression control (log2 RPKM) 10 qRT-PCR 100 80 60 40 20 0 Meg3/Gtl2 Rian Malat1/Neat2 Xist 120 100 80 60 40 20 0 e C o ol ntr ig o o l TD ol Pig 43 o 2 120 Saline Control RNA-seq TDP-43 KD Percentage Meg3/Gtl2 TDP-43 f 140 lin 6 4 11,852 expressed genes Saline Sa TDP-43 KD Downregulated genes (239) Upregulated genes (362) 0 94.33% sense 0 8 0 103,514,455 reads Direction of transcription in genes Percentage ncRNA expression 10 127,725,000 reads Densitometric analysis Control oligo d TDP-43 KD 294,180,255 reads 72 nt long (n = 4) Trimmed and mapped GAPDH Percentage Percentage TDP-43 TDP-43 mRNA protein Figure 2 In vivo depletion of TDP-43 in mouse brain with ASOs. (a) Strategy for depletion of TDP-43 in mouse striatum. TDP-43–specific or control ASOs were injected into the striatum of adult mice. TDP-43 mRNA was degraded via endogenous RNase H digestion, which specifically recognizes ASO–pre-mRNA (DNA/ RNA) hybrids. Mice were killed after 2 weeks and striata were dissected for RNA and protein extraction. (b) Semi-quantitative immunoblot demonstrating substantial depletion of TDP-43 protein to 20% of controls in TDP-43 ASO–treated mice compared with saline- and control ASO–treated mice. qRT-PCR showed similar TDP-43 depletion at the mRNA level. KD, knockdown. (c) Flow-chart illustrating the number of reads sequenced and aligned from the RNA-seq experiments. (d) Differentially regulated genes identified by RNA-seq analysis. Scatter-plot revealed the presence of 362 and 239 genes (diamonds) that were significantly up- (red) or down- (green) regulated after TDP-43 depletion. RNA-seq analysis confirmed that TDP-43 levels were reduced to 20% of control animals. (e) Normalized expression (based on RPKM values from RNA-seq) of four noncoding RNAs that were TDP-43 targets and were downregulated after TDP-43 depletion. (f) Quantitative RT-PCR validation of noncoding RNA Meg3/Gtl2. Expression TDP-43 knockdown (log2 RPKM) © 2011 Nature America, Inc. All rights reserved. immunoprecipitation (RIP, an approach with the serious caveat that the absence of cross-linking allows re-association of RNAs and RNAbinding proteins after cell lysis, as previously documented19) revealed 2,672 of the genes with CLIP-seq clusters in common. As expected from our CLIP-seq analysis in whole brain, we found strong representation of neuronal (see below) and glial mRNA targets, including glutamate transporter 1 (Glt1), myelin-associated glycoprotein (Mag) and myelin oligodendrocyte glycoprotein (Mog). 3 articles Relative mRNA expression (fold change) Cluster count Mean cluster count in introns Total intron length (kb) b on the survival of the animals. Mice were killed a 40 600 200 Upregulated Downregulated 2 after 2 weeks and total RNA and protein from Saline Control oligo 35 500 150 TDP-43 KD striata were isolated (Fig. 2a). Samples treated 30 1.5 100 with TDP-43 ASO showed a significant and 400 25 reproducible reduction of TDP-43 RNA and 50 300 20 1 protein to approximately 20% of normal levels 0 15 when compared with controls (P < 3 × 10–5; 200 10 0.5 Fig. 2b). 100 5 To explore the effects of TDP-43 downregulation on its target RNAs, we converted 0 0 0 Nrxn3 Park2 Nlgn1 Fgf14 Kcnd2 Cadps Efna5 Chat Most Most Not downregulated poly-A–enriched RNAs from four biological upregulated regulated replicates of TDP-43 or control ASO-treated, c 1.0 Mouse exons Mouse introns Human introns as well as three saline-treated animals, to P < 6 x 10 P < 5.3 x 10 0.9 P < 0.0872 0.8 cDNAs and sequenced them in a strand0.7 specific manner22, yielding an average of >25 0.6 0.5 million 72-bp reads per library. The number 0.4 of mapped reads per kilobase of exon, per milEnriched in brain 0.3 Not enriched in brain lion mapped reads (RPKM) for each annotated 0.2 Random subset 0.1 Not in random subset protein-coding gene was determined to estab0 12 10 10 10 10 10 10 10 10 10 lish a metric of normalized gene expression . Median exon length (nt) Median intron length (nt) Median intron length (nt) Hierarchical clustering of gene expression values for the independent samples revealed Figure 3 Binding of TDP-43 on long transcripts enriched in brain sustains their normal mRNA levels. high correlation (R2 = 0.96) between biologi- (a) Correlation between RNA-seq and CLIP-seq data. Genes were ranked on their degree of regulation cal replicates of each condition (TDP-43 and after TDP-43 depletion (x axis) and the mean number of intronic CLIP clusters found in the next 100 control ASO/saline) (Supplementary Fig. 3a). genes from the ranked list were plotted (y axis, green line). Similarly, the mean total intron length for the next 100 genes was plotted (y axis, red line). The cluster count for each upregulated gene (red dots) Notably, all control ASO–treated samples were and each downregulated gene (green dots) was plotted using the same ordering (inset). (b) qRT-PCR clustered together, as were the samples from for selected downregulated genes with long introns (except for Chat) revealed a significant reduction TDP-43 ASO–treated animals, consistent with of all transcripts when compared with controls (P < 8 × 10−3). Error bars represent s.d. calculated TDP-43 reduction having an appreciable effect in each group for 3–5 biological replicates. (c) Cumulative distribution plots comparing exon length (left) or intron length (middle) across mouse brain tissue–enriched genes (388 genes) and non-brain on regulation of gene expression. Reads of each treatment group were com- tissue–enriched genes (15,153 genes). Genes enriched in brain had significantly longer median intron −6 bined, yielding greater than 100 million length compared with genes not enriched in brain (right, solid red line and black lines, P < 6.2 × 10 by two-sample Kolmogorov Smirnov goodness-of-fit hypothesis test), whereas a random subset of uniquely mapped reads per condition (Fig. 2c). 388 genes showed no difference in intron length (dashed lines). Similar analysis across human brain Approximately 70% (11,852) of annotated pro- tissue–enriched genes (387 genes) and non-brain tissue–enriched genes (17,985 genes) also showed tein-coding genes in mouse satisfied at least 1 significantly longer introns in brain enriched genes (solid red and black lines, P < 5.3 × 10−6), whereas a RPKM in either condition. Statistical compari- random subset of 387 genes showed no difference in intron length (dashed lines). son revealed that 362 genes were significantly upregulated and 239 downregulated on reduction of TDP-43 protein Of the RNAs containing TDP-43 clusters, 96% were unaffected by TDP(P < 0.05; Fig. 2d and Supplementary Tables 1 and 2). We found that 43 depletion, suggesting either that other RNA-binding proteins comTDP-43 itself was downregulated by RPKM analysis to 20% of the levels pensate for TDP-43 loss or that the remaining 20% of TDP-43 protein in control treatments, consistent with quantitative RT-PCR (qRT-PCR) suffices to regulate these transcripts. For the 239 RNAs downregulated measurements (Fig. 2b). RNAs unique to neurons (including doublecor- after TDP-43 depletion, a striking enrichment of multiple TDP-43 bindtin, the neuron-specific beta-tubulin and choline O-acetyl transferase ing sites was observed (Fig. 3). In fact, the 100 most downregulated (Chat)) or glia (including glial fibrillary acidic protein, myelin binding genes contained an average of ~37 TDP-43–binding sites per pre-mRNA protein, Glt1 and Mag) were highly represented in the RNA-seq data, con- and 12 genes had more than 100 clusters (Fig. 3a and Supplementary firming assessment of RNA levels in multiple cell types, as expected. Table 2). We did not observe this bias for multiple TDP-43 binding Of the set of ~242 literature-curated murine noncoding RNAs23, sites if we randomized the order of genes (Supplementary Fig. 4a), expression of 4 increased and expression of 55 decreased by more than if we ordered them by their expression levels (RPKM) in either treattwofold on TDP-43 depletion (P < 10−5; Supplementary Table 3). ment (Supplementary Fig. 4b,c), or if the Argonaute-binding sites were Malat1/Neat2, Xist, Rian and Meg3 are noncoding RNA examples that plotted on genes ranked by their expression pattern on TDP-43 deplewere both decreased (Fig. 2e,f) and bound by TDP-43, consistent with tion (Supplementary Fig. 4d). Furthermore, this trend was significant for TDP-43 clusters found in introns (Fig. 3a), but not in exons, 5′ or TDP-43 having a direct role in regulating their expression. 3′ UTRs (Supplementary Fig. 4e–g). To address whether TDP-43 binding enrichment in the downTDP-43 binding to long pre-mRNAs sustains their levels RNA-seq and CLIP-seq datasets were integrated by first ranking all regulated genes could be attributed to intron size, we performed the 11,852 expressed genes by their degree of change on TDP-43 reduc- same analysis on the ranked list, but calculated total (Fig. 3a) or mean tion compared with control treatment. For each group of 100 consec- (Supplementary Fig. 4h) intron size instead of cluster counts. We utively ranked genes (starting from the most upregulated gene), we found that the most downregulated genes after TDP-43 reduction determined the mean number of TDP-43 clusters. No enrichment in had exceptionally long introns that were more than sixfold longer TDP-43 clusters in the upregulated genes was identified, indicating that (average of 28,707 bp; median length of 11,786 bp) than unaffected their upregulation was likely an indirect consequence of TDP-43 loss. or upregulated genes (average of 4,532 bp, median length of 2,273 bp, –6 © 2011 Nature America, Inc. All rights reserved. Frequency –6 2 4 3 4 2 3 4 2 3 4 advance online publication nature neuroscience articles Table 1 The fraction of downregulated, but not upregulated TDP-43 targets increased with intron size Mean intron length (kb) Expressed genes TDP-43 targets Downregulated genes Downregulated TDP-43 targets Upregulated genes Upregulated TDP-43 targets 0–1 2,566 510 (20%) 31 8 (26%) 100 7 (7%) 1–10 8,022 4,485 (56%) 80 47 (59%) 252 56 (32%) 10–100 1,238 1,027 (83%) 109 104 (95%) 10 3 (30%) >100 26 26 (100%) 19 19 (100%) 0 0 (0%) Total 11,852 6,048 (51%) 239 178 (74%) 362 66 (18%) © 2011 Nature America, Inc. All rights reserved. Number of TDP-43 targets among expressed, downregulated or upregulated genes, grouped by their mean intron length as indicated. The percentages in parentheses represent the fraction of TDP-43 targets found in each group upon TDP-43 depletion. P < 4 × 10−18 by t test). Again, this correlation of downregulation with intron size was not observed for any of the control conditions mentioned above (Supplementary Fig. 4a–g). Indeed, the enrichment of TDP-43 binding can be largely attributed to intron size differences, as the number of TDP-43–binding sites per kilobase of intron length (cluster density; Supplementary Fig. 4i) was only slightly increased (P < 0.022) for downregulated versus unaffected or upregulated genes (0.072 sites per kb downregulated genes, 0.059 sites per kb other genes). Dividing all mouse protein-coding genes into four groups on the basis of mean intron length (<1 kb, 1–10 kb, 10–100 kb and >100 kb) confirmed that the fraction of TDP-43 targets increased (from 20% to 100%) with intron size (Table 1). Indeed, 83% of genes that contained average intron lengths of 10–100 kb and all 26 genes that contained >100-kb-long introns were direct targets of TDP-43. A highly significant fraction (74%) of all downregulated genes were direct targets of TDP-43 in comparison with genes that were unchanged (52%, P < 0.001) or upregulated (18%, P < 10−17) on TDP-43 depletion. Notably, all 19 downregulated genes with >100-kb-long introns were direct TDP-43 targets. In strong contrast, no genes in the same intron length category were upregulated on TDP-43 depletion and only 30% of upregulated genes with 10–100-kb-long introns were TDP-43 targets (Table 1). The crucial role of TDP-43 in maintaining the mRNA abundance of long intron-containing genes was also reflected by the downregulation after TDP-43 depletion of ~10% of genes with >10-kblong introns, the large majority (123 of 128, 96%) of which are direct TDP-43 targets. Gene ontology analysis showed that TDP-43 targets whose expression was downregulated after TDP-43 depletion were highly enriched for synaptic activity and function (Supplementary Figs. 5 and 6 and Supplementary Table 4). Notably, several genes with long introns targeted by TDP-43 have crucial roles in synaptic function and have been implicated in neurological diseases, such as subunit 2A of the NMDA receptor (Grin2a), the ionotropic glutamate receptor 6 (Grik2/GluR6), the calcium-activated potassium channel alpha (Kcnma1), the voltagedependent calcium channel (Cacna1c), and the synaptic cell-adhesion molecules neurexin 1 and 3 (Nrxn1, Nrxn3) and neuroligin 1 (Nlgn1). We analyzed a compendium of expression array data from different mouse organs and human tissues and found that genes preferentially expressed in brain have significantly longer introns (P < 6 × 10−6), but not exons (Fig. 3c). The length of these genes was not correlated with the size of the respective proteins and the prevalence of long introns was largely conserved between the corresponding mouse and human genes. Although binding of TDP-43 in long introns can be explained by the increased likelihood to contain UG repeats, the conservation through evolution of this particular gene structure (Supplementary Fig. 7) suggests that these exceptionally long introns contain important regulatory elements. To validate the RNA-seq results, we analyzed a selection of brainenriched TDP-43 targets containing long introns. Genome browser views of neurexin 3 (Nrxn3), Parkin 2 (Park2), neuroligin 1 (Nlgn1), fibroblast growth factor 14 (Fgf14), potassium voltage-gated channel subfamily D member 2 (Kcnd2), calcium-dependent secretion activator (Cadps) and ephrin-A5 (Efna5) revealed a scattered distribution of multiple TDP-43–binding sites across the full length of the pre-mRNA (Supplementary Fig. 7a), consistent with the results of the global analysis (Fig. 3a). qRT-PCR verified the TDP-43–dependent reduction of all the long transcripts that we tested (Fig. 3b). Chat had a median intron size of <10 kb, with TDP-43 clusters restricted to a single intronic site (Supplementary Fig. 7a). Nevertheless, qRT-PCR confirmed the RNA-seq result of a significant reduction in Chat levels after TDP-43 depletion (P = 0.005; Fig. 3b). Only 18% of the upregulated genes were direct targets of TDP43 (Table 1) and gene ontology analysis revealed an enrichment for genes involved in the inflammatory response (Table 2), suggesting that their differential expression is an indirect consequence of TDP43 loss. However, of the 66 upregulated RNAs that contained CLIPseq clusters, 29% harbored TDP-43–binding site(s) in their 3′ UTR, a percentage that is twofold higher than that of downregulated genes (Supplementary Fig. 8). This suggests that TDP-43 represses gene expression when bound to 3′ UTRs. Table 2 Main functional categories enriched within the TDP-43–regulated genes GO Downregulated genes (239) categories Molecular function Cellular component Biological process GO term GO:0006811 Ion transport Upregulated genes (362) Corrected P value GO term Corrected P value 2.31 × 10–8 GO:0006952 Defense response 2.28 × 10–20 10–7 GO:0007268 Synaptic transmission 1.79 × GO:0006954 Inflammatory response 4.97 × 10–12 GO:0045202 Synapse 5.33 × 10–8 GO:0000323 Lytic vacuole 1.70 × 10–10 GO:0005886 Plasma membrane 2.35 × 10–13 GO:0005764 Lysosome 1.70 × 10–10 GO:0005216 Ion channel activity 1.49 × 10–11 GO:0004197 Cysteine-type endopeptidase activity 6.00 × 10–4 GO:0022803 Passive transmembrane 7.06 × 10–12 transporter activity GO:0003950 NAD+ ADPribosyltransferase activity 1.11 × 10–2 Examples of the most significantly enriched Gene Ontology (GO) terms in the list of up- or downregulated genes on TDP-43 knockdown, as annotated. Downregulated genes, the majority of which were direct targets of TDP-43, were enriched for synaptic activity and function. In contrast, upregulated genes, most of which were not bound by TDP-43, encoded primarily inflammatory and immune response–related proteins. The P value indicated was corrected for multiple testing using the Benjamini-Hochberg method. nature neuroscience advance online publication 5 a Constitutive all exons cassette EST/mRNA defined Included all exons excluded RNA-seq defined P < 3 x 10–6 P < 2 x 10–3 Included array exons excluded Junction array defined P < 10–3 0 b 10 20 30 Percent of exons with TDP-43 binding within 2 kb 2 kb Inclusion 76% (480 reads) TDP-43 KD Control CLIP clusters RNA-seq control 19% (117 reads) RNA-seq TDP-43 KD 24% (147 reads) Sort1 81% (498 reads) Exclusion c Human exons orthologous to mouse Human EST/mRNA defined cassette exons Human constitutive exons Human, other types of alternative splicing d Exons included on Exons excluded on TDP-43 depletion in mouse TDP-43 depletion in mouse 3% 7% 12% 85% 36% 3 2 1 0 3 2 1 0 Control KD Sort1 Sema3f 0.4 1.5 1 0.5 0 3 2 1 0 Pdp1 2 1 0 Ddef2 2 1 0 Mpzl1 69% Exons excluded on TDP-43 depletion Control KD Kcnd3 All mouse exons on junction array 9% 22% 57% Exons included on TDP-43 depletion 0.2 TDP-43 mediates alternative splicing of 0 its mRNA targets Atp2b2 8 Although TDP-43–binding sites were enriched 4 in distal introns (Fig. 1f), 11% (21,041 out of 0 Eif4h 190,161) of all mouse exons, including both 0.8 constitutive and alternative exons, contained 0.4 TDP-43–binding site(s) in a 2-kb window 0 Dnajc5 extending from the 5′ and 3′ exon-intron boundaries (Fig. 4a). Compared with all exons, TDP-43 clusters were significantly enriched (P < 8 × 10−3) around exons with transcript evidence for either alternative inclusion or exclusion (that is, cassette exons). Of the 8,637 known mouse cassette exons, 15.1% contained TDP-43–binding sites in the exon or intron within 2 kb of the splice sites. A splice index score for all exons, a measure similar to the ‘percent spliced in’ (or ψ) metric24, was determined by the number of reads that mapped on exons as well as reads that mapped at exon junctions (Fig. 4b). This analysis resulted in the identification of 203 cassette exons that were differentially included (93) or excluded (110) (P < 0.01) when TDP-43 was depleted. Notably, sortilin 1, the gene encoding the receptor for progranulin25,26, had the highest splice index score, with exon 18 exclusion requiring TDP-43 (Fig. 4b). Included exons (P < 3 × 10−6) and, to a lesser extent, excluded exons (P < 2 × 10−3) identified by RNA-seq were significantly enriched (~2.7-fold and ~2.0-fold, respectively) for TDP-43 binding when compared to all mouse exons (Fig. 4a). Only 33% of RNA-seq–verified TDP-43–regulated cassette exons had previous expressed sequence tag (EST)/mRNA evidence for alternative splicing, demonstrating that our approach identified previously unknown alternative splicing events. 6 P < 8 x 10–3 12 8 4 0 Ratio inlcusion to exclusion Figure 4 TDP-43 mediates alternative splicing regulation of its RNA targets. (a) Schematic representation of different exon classes defined by EST and mRNA libraries, RNA-seq or splicingsensitive microarray data as indicated (left). Right, bar plot displaying the percentage of exons that contain TDP-43 clusters within 2 kb upstream and downstream of the exon-intron junctions. (b) Example of alternative splicing change on exon 18 of sortilin 1 analyzed using RNA-seq reads mapping to the exon body (left). Arrows depict increased density of reads in the TDP-43 knockdown samples compared with controls. 76% of the spliced-junction reads in the TDP-43 knockdown samples supported inclusion versus only 19% in the control oligo–treated samples (right). (c) Comparison of mouse cassette exons detected by splicingsensitive microarrays to conserved exons in human orthologous genes. 85% and 57% of human exons corresponding to the excluded and included mouse exons, after TDP-43 depletion, respectively (left and middle pie charts), contained EST and mRNA evidence for alternative splicing. As a control, the percentage of human exons orthologous to all mouse exons represented on the splicing-sensitive array and that have alternative splicing evidence are shown in the right pie chart. (d) Semiquantitative RT-PCR analyses of selected targets confirmed alternative splicing changes in TDP-43 knockdown samples compared with controls. Right, representative acrylamide gel pictures of RT-PCR products from control or knockdown adult brain samples. Quantification of splicing changes from three biological replicates per group (left); error bars represent s.d. Ratio inlcusion to exclusion © 2011 Nature America, Inc. All rights reserved. articles 3 2 1 0 Control KD 4 2 0 4 2 0 Control KD 2 Ppp3ca Dclk1 0 15 10 5 0 Dst Tsn 4 4 2 0 4 2 Rbm33 ex3 ex2&3 0 Poldip3 Control (n = 3) TDP-43 KD (n = 3) Kcnip2 As an independent method of identifying TDP-43–regulated exons, RNAs from the same ASO-treated animals were analyzed on customdesigned splicing-sensitive Affymetrix microarrays27. Using a conservative statistical cutoff, we detected 779 alternatively spliced events that significantly changed on TDP-43 depletion (Supplementary Fig. 9). Notably, included (P < 10−3), but not excluded, exons (P < 0.3) were significantly enriched for TDP-43 binding (~1.8-fold and ~1.3-fold, respectively) when compared with the unchanged exons on the microarray (Fig. 4a), similar to the trend seen by RNA-seq. The combined RNA-seq and splicing-sensitive microarray data defined a set of 512 alternatively spliced cassette exons whose splicing was affected by loss of TDP-43. The majority of human orthologs of these murine exons (85% of those with excluded and 57% with included exons) had prior EST/mRNA evidence for alternative splicing (Fig. 4c). Semi-quantitative RT-PCR on selected RNAs validated splicing alterations with more inclusion or exclusion after TDP-43 reduction (Fig. 4d and Supplementary Fig. 10). Notably, varying the extent of TDP-43 downregulation (between 40–80%) correlated with the magnitude of splicing changes (Supplementary Fig. 11). However, the majority of advance online publication nature neuroscience articles b 5 kb 120 P < 4 x 10 –3 WT 80 GAPDH 60 100 80 60 40 20 0 40 20 0 d WT TDP-43 Tg Tet-on: 24 h GFP–TDP-43: – + + GFP-TDP-43 Endogenous TDP-43 30 25 UGUGUG 1 2 3 Tardbp WT TDP-43 Tg 48 h – 58 46 CLIP clusters TDP-43 Tg myc–TDP-43 end. TDP-43 100 CLIP reads combined c Percentage endogenous TDP-43 protein a Percentage endogenous TDP-43 mRNA altered splicing events observed on TDP-43 depletion do not have TDP-43 clusters within 2 kb of the splice sites, implicating longerrange interactions or indirect effects of TDP43 through other splicing factors. Consistent with this latter hypothesis, we identified TDP43 binding on pre-mRNAs of RNA-binding proteins, including Fus/Tls, Ewsr1, Taf15, Adarb1, Cugbp1, RBFox2 (Rbm9), Tia1, Nova 1 and 2, Mbnl, and neuronal Ptb (or Ptbp2). After TDP-43 depletion, mRNA levels of Fus/Tls and Adarb1 were reduced and exon 5 in the Tia1 transcript was more included, whereas exon 10 in Ptbp2 was more excluded (Supplementary Table 5). GAPDH TDP-43 autoregulation through binding on its 3′ UTR e isoform 3–specific primers f g P < 3 x 10 We found TDP-43–binding sites in an alterP < 2 x 10 180 TDP-43 3’ UTR 3 Unrelated 3’ UTR 40 natively spliced intron in the 3′ UTR of the 160 Short TDP-43 3’ UTR P < 10 TDP-43 pre-mRNA (Fig. 5a). This binding 140 Long TDP-43 3’ UTR 30 100 P < 10 120 does not coincide with a long stretch of UG P < 10 80 100 20 repeats, suggesting a lower strength of bindP < 10 60 80 ing (Supplementary Fig. 2a), consistent with a 10 40 60 recent report28. TDP-43 mRNAs spliced at this 40 20 0 20 0 site (Fig. 5a) are predicted to be substrates for myc–TDP-43–HA – + – + Tet-on: 48 24 48 0 nonsense-mediated RNA decay (NMD), a prosiRNA UPF1 – – + + GFP–TDP-43: – + + No co-transfection RFP myc–TDP-43–HA siRNA UPF1 cess that targets mRNAs for degradation when exon-junction complexes (EJCs) deposited Figure 5 Autoregulation of TDP-43 through binding on the 3′ UTR of its own transcript. (a) CLIP-seq during splicing, located 3′ of the stop codon, reads and clusters on TDP-43 transcript showing binding mainly in an alternatively spliced part of the are not displaced during the pioneer round 3′ UTR, lacking long uninterrupted UG repeats. (b) qRT-PCR showing ~50% reduction of endogenous of translation29. In contrast, TDP-43 mRNAs TDP-43 mRNA in transgenic mice overexpressing human myc–TDP-43 not containing introns and 3′ UTR. (c) Immunoblots confirming the reduction of endogenous TDP-43 protein (upper panel) with an unspliced 3′ UTR would not have such to 50% of control levels (by densitometry, lower panel) in response to human myc–TDP-43 a premature termination codon and should overexpression. (d) Immunoblot showing reduction of endogenous TDP-43 protein in HeLa cells on escape NMD. This TDP-43 binding implies tetracycline induction of a transgene encoding GFP-myc–TDP-43–HA (annotated GFP–TDP-43). The autoregulatory mechanisms reminiscent to ~30-kDa product accumulating after 48 h (red arrow) was immunoreactive with four TDP-43–specific those reported for other RNA-binding pro- antibodies. (e) qRT-PCR using primers spanning the junctions of TDP-43 isoform 3 (upper panel), teins30,31. Indeed, expression in mice of a TDP- showed ~100-fold increase in response to tetracycline induction of GFP–TDP-43. TDP-43 isoform 3 was present at very low levels before tetracycline induction. (f) Luciferase assays showing significant 43–encoding transgene without the regulatory reduction of relative fluorescence units (RFUs) in cells expressing renilla luciferase under the control 3′ UTR (E.S., S.-C.L. and D.W.C., unpublished of long TDP-43 3′ UTR when co-transfected with myc–TDP-43–HA. Cells treated with siRNA against observations) led to significant reduction of UPF1 showed a significant increase of RFUs when expressing Luciferase with the long TDP-43 endogenous TDP-43 mRNA and protein 3′ UTR, but not in controls. (g) qRT-PCR scoring levels of endogenous TDP-43 isoform 3 in HeLa cells transiently transfected with myc–TDP-43–HA, UPF1 siRNA or both simultaneously. TDP-43 isoform 3 (P < 4 × 10–3; Fig. 5b,c) in the CNS. To identify the molecular basis of this mech- was increased in response to elevated TDP-43 protein levels, blocking of NMD (by UPF1 siRNA) and there was a synergistic effect in the combined condition. anism, we generated HeLa cells in which GFPmyc–TDP-43–HA mRNA lacking introns and 3′ UTR was transcribed from a single copy, tetracycline-inducible trans- sites) of the TDP-43 3′ UTR downstream of the stop codon of a renilla gene inserted at a predefined locus by site-directed (Flp) recombinase16. luciferase gene (Supplementary Fig. 12a). Both unspliced and spliced After 24 or 48 h of GFP-myc–TDP-43–HA induction, a substantial 3′ UTRs were determined to be present in brain RNAs from mouse and reduction of endogenous TDP-43 protein was observed, accompanied human CNSs (Supplementary Fig. 12a). Both variants, as well as an by accumulation of a shorter, ~30-kDa product (Fig. 5d) recognized unaltered luciferase reporter, were transfected into HeLa cells along with by four different TDP-43–specific antibodies. Although this ~30-kDa plasmids driving either increased TDP-43 expression or red fluorescent band could be derived from the transgene encoding TDP-43, it was protein (RFP; Supplementary Fig. 12b). Increased levels of TDP-43 pronot recognized by antibodies to myc or HA and its size is compatible tein led to a significant reduction of luciferase produced from the gene with the endogenous TDP-43 isoform 3. Using qRT-PCR with primers carrying the long, intron-containing TDP-43 3′ UTR when compared with spanning the exon junctions of TDP-43 isoform 3, we found a ~100-fold the short or unrelated 3′ UTR (P < 10–4; Fig. 5f). Moreover, co-transfection increase of the spliced isoform 3 on overexpression of GFP-myc–TDP- of the reporters with siRNAs targeting UPF1 (Supplementary Fig. 12c), 43–HA protein (Fig. 5e). an essential component that marks an NMD substrate for degradaTo test whether TDP-43 drives splicing of its pre-mRNA through bind- tion32, enhanced luciferase produced by the intron-containing 3′ UTR ing to its 3′ UTR, we cloned a long unspliced version (containing the TDP- by ~1.5-fold, indicating that UPF1-dependent degradation of this mRNA 43–binding sites) and a short spliced version (without TDP-43–binding occurred (Fig. 5f). Lastly, the endogenous spliced isoform 3 of TDP-43 –2 nature neuroscience advance online publication –4 –4 Fold change endogenous isoform 3 TDP-43 mRNA –6 –12 Percentage of control (normalized to RFU) Fold change endogenous isoform 3 TDP-43 mRNA © 2011 Nature America, Inc. All rights reserved. –3 7 articles a CLIP reads combined CLIP clusters RNA-seq Control RNA-seq TDP-43 KD was increased, not only on elevated TDP-43 expression (by transient transfection), but also on blocking of NMD, with a synergistic effect in the combined conditions (Fig. 5g). 8 KD/control = 0.7 Fus/Tls Mammalian conservation c 120 Control oligo TDP-43 KD Control oligo TDP-43 KD 100 FUS/TLS 80 GAPDH 60 40 20 0 Saline Control TDP-43 oligo KD d 2 kb CLIP reads combined CLIP clusters RNA-seq Control RNA-seq TDP-43 KD KD/control = 3 120 80 40 0 e Fold change Grn mRNA b Percentage FUS/TLS protein TDP-43 regulates expression of disease-related transcripts TDP-43 protein bound and directly regulated a variety of transcripts involved in neurological diseases (Supplementary Table 6 and Supplementary Figs. 8 and 13), including Fus/Tls and Grn (Fig. 6), which encode FUS/TLS and progranulin, mutations in which cause ALS10,11 or FTLD33,34, respectively. TDP-43 bound to the 3′ UTR of Fus/Tls mRNA and to introns 6 and 7, all of which are highly conserved between mammalian species (Fig. 6a). Gene annotation and the presence of RNA-seq reads in these introns were consistent with either an alternative 3′ UTR or intron retention. Fus/Tls mRNA and protein were reduced to approximately 40% of their normal levels (Fig. 6b,c). Progranulin mRNA, on the other hand, was markedly increased by ~3–6-fold compared with controls (Fig. 6d,e). CLIP-seq data also confirmed TDP-43 binding to two RNAs that were previously reported to be associated with TDP-43: histone deacetylase 6 (Hdac6)35 and low–molecular weight neurofilament subunit (Nefl)36. Our RNA-seq data indicate that HDAC6, which functions to promote the degradation of polyubiquitinated proteins, was reduced on TDP-43 depletion (Supplementary Fig. 14a,b), albeit to a lesser degree in vivo than previously reported in cell culture35. It has been known for many years that Nefl mRNA levels are reduced in degenerating motor neurons from individuals with ALS37. We identified TDP-43 clusters in the 3′ UTR of Nefl (Supplementary Fig. 14c). In addition, RNA-seq data confirmed that the mouse Nefl 3′ UTR was longer than annotated and Nefl mRNA levels were slightly reduced on TDP-43 depletion (Supplementary Fig. 14d). Multiple TDP-43–binding sites were also present in the premRNA from the Mapt gene encoding tau, whose mutation or altered splicing of exon 10 has been implicated in FTD38. However, neither the levels nor the splicing pattern of Mapt RNA were affected by TDP-43 depletion (Supplementary Table 6). Moreover, we identified multiple TDP-43 intronic binding sites in the Hdh transcript (Supplementary Fig. 13), which encodes huntingtin, the protein whose polyglutamine expansion causes Huntington’s disease in humans39, accompanied by cytoplasmic TDP-43 accumulations40. Moreover, Hdh levels were decreased in mouse brain on TDP-43 depletion (Supplementary Table 6). In contrast, we found no evidence for direct binding or TDP-43 regulation of the Sod1 transcript (Supplementary Table 6), whose aberrant splicing in familial ALS cases41,42 has raised the possibility that SOD1 missplicing may be involved in the pathogenesis of sporadic ALS. 5 kb Percentage FUS/TLS mRNA © 2011 Nature America, Inc. All rights reserved. Figure 6 TDP-43 regulates expression of FUS/TLS and progranulin. (a) CLIP-seq reads and clusters on Fus/Tls transcript showing TDP-43 binding in introns 6 and 7 (also annotated as an alternative 3′ UTR) and the canonical 3′ UTR. RNA-seq reads from control or TDP-43 knockdown samples (equal scales) showed a slight reduction in Fus/Tls mRNA in the TDP-43 knockdown group and expression values from RNA-seq confirmed that Fus/Tls mRNA was reduced to 70% of control on TDP-43 reduction. (b) qRT-PCR confirmed Fus/Tls mRNA downregulation on TDP-43 depletion. s.d. was calculated within each group for three biological replicas. (c) Semiquantitative immunoblot (upper panel) revealed a slight, but consistent, reduction of FUS/TLS protein in ASO-treated mice to ~70% of control levels, as quantified by densitometric analysis (lower panel). (d) CLIP-seq reads and clusters on progranulin transcript (Grn) showing a sharp binding in the 3′ UTR. RNA-seq reads from control or TDP-43 knockdown samples (equal scales) showed a significant increase in progranulin mRNA in the TDP-43 knockdown group compared with controls. (e) qRT-PCR confirmed the statistically significant increase (P < 3 × 10−4) in progranulin mRNA in samples with reduced TDP-43 levels when compared with controls. s.d. was calculated in each group for three biological replicas. 7 6 5 4 3 2 1 0 Saline Control TDP-43 oligo KD Grn DISCUSSION TDP-43 is a central component in the pathogenesis of an ever-increasing list of neurodegenerative conditions. We created a genome-wide RNA map of >39,000 TDP-43–binding sites in the mouse transcriptome and determined that levels of 601 mRNAs and splicing patterns of 965 mRNAs were altered following TDP-43 reduction in the adult nervous system. Thus, although earlier efforts have implicated TDP43 as a splicing regulator of a few candidate genes20,43,44, our RNA-seq and microarray results establish that TDP-43 regulates the largest set (512) of cassette exons thus far reported, demonstrating its broad role in advance online publication nature neuroscience © 2011 Nature America, Inc. All rights reserved. articles alternative splicing regulation. We also found that TDP-43 was required for maintenance of 42 non-overlapping, noncoding RNAs. These findings also provide a direct test for how the nuclear loss of TDP-43 widely reported in the remaining motor neurons in ALS autopsy samples1 may contribute to neuronal dysfunction, independent of potential damage from TDP-43 aggregates. The results of our experiments using in vivo reduction of TDP-43 coupled with RNA sequencing indicate that TDP-43 is crucial for sustaining levels of 239 mRNAs, including those encoding synaptic proteins, the choline acetyltransferase, and the disease-related proteins FUS/TLS and progranulin. A substantial proportion of these pre-mRNAs are directly bound by TDP43 at multiple sites in exceptionally long introns, a feature that we found most prominently in brain-enriched transcripts (Fig. 3), thereby identifying one component of neuronal vulnerability from TDP-43 loss. A plausible model for the role of TDP-43 in sustaining the levels of mRNAs derived from long pre-mRNAs is that TDP-43 binding in long introns prevents unproductive splicing events that would introduce premature stop codons and thereby promote RNA degradation. Our results thus identify a previously unknown conserved role for TDP-43 in regulating a subset of these long intron-containing brain-enriched genes. None of our evidence eliminates the possibility that TDP-43 affects RNA levels by additional mechanisms, such as through transcription regulation or by facilitation of RNA polymerase elongation, similar to what has been shown for another splicing regulator, SC35 (ref. 45). FUS/TLS is another RNA-binding protein whose mutation causes ALS10,11 and, in some rare cases, FTLD-U. Similar to TDP-43, FUS/ TLS aggregation has been observed in different neurodegenerative conditions, including Huntington’s disease and spinocerebellar ataxia (reviewed in ref. 3). We have now shown that Fus/Tls mRNA is a direct target of TDP-43 and its level is reduced on TDP-43 depletion (Fig. 6), thereby identifying a previously unknown FUS/TLS dependency on TDP-43. The latter is also true for two additional disease-relevant proteins, progranulin and its proposed receptor sortilin. Progranulin levels were sharply increased on TDP-43 reduction and splicing of sortilin was altered. In fact, TDP-43 directly binds and regulates the levels and splicing patterns of transcripts implicated in various neurologic diseases46 (Supplementary Table 6 and Supplementary Fig. 13), consistent with a broad role of TDP-43 in these conditions3. Finally, in contrast with a recent study that reported an autoregulatory mechanism for TDP-43 that is independent of pre-mRNA splicing28, our results indicate that TDP-43 acts as a splicing regulator to reduce its own expression level by binding to the 3′ UTR of its own pre-mRNA. Although there may be additional mechanisms beyond NMD28, we found that TDP-43 enhanced splicing of an alternative intron in its own 3′ UTR, thereby autoregulating its levels through a mechanism that involves splicing-dependent RNA degradation by NMD (Fig. 5). TDP-43 autoregulation occurs in the mammalian CNS, as seen by a significant reduction of endogenous TDP-43 mRNA and protein in response to the expression of a TDP-43 transgene lacking the regulatory intron, as we found and others have reported previously47,48. Both spliced and unspliced TDP-43 RNAs were found in human and mouse brain, consistent with autoregulation at normal TDP-43 levels that substantially attenuates TDP-43 synthesis through production of unstable, spliced RNA. TDP-43-dependent splicing of its 3′ UTR intron as a key component of a TDP-43 autoregulatory loop could participate in a feedforward mechanism enhancing the cytoplasmic TDP-43 aggregates that are hallmarks of familial and sporadic ALS. Following an initiating insult (for example, one that traps some TDP-43 in initial cytoplasmic aggregates), the reduction in nuclear TDP-43 levels would decrease splicing of its 3′ UTR, which would in turn produce an elevated pool of stable TDP-43 mRNA. Repeated translation of that TDP-43 mRNA would increase nature neuroscience advance online publication synthesis of new TDP-43 in the cytoplasm whose subsequent co-aggregation into the initial complexes would drive their growth. Disrupted autoregulation, by any event that lowers nuclear TDP-43, thus provides a mechanistic explanation for what may be a critical, intermediate step in the molecular mechanisms underlying age-dependent degeneration and death of neurons in TDP-43 proteinopathies. METHODS Methods and any associated references are available in the online version of the paper at http://www.nature.com/natureneuroscience/. Accession codes. Microarray CEL files and sequenced reads have been deposited at the Gene Expression Omnibus database repository and the NCBI Short Read Archive, respectively. All our data (microarray, RNA-seq and CLIP-seq) are combined as a ‘SuperSeries entry’ under one accession number GSE27394. Note: Supplementary information is available on the Nature Neuroscience website. ACKNOWLEDGMENTS The authors would like to thank members of B. Ren’s laboratory, especially Z. Ye, S. Kuan and L. Edsall for technical help with the Illumina sequencing and U. Wagner for helpful discussions, K. Clutario and J. Boubaker for technical help, as well as all of the members of the Yeo and Cleveland laboratories, M. Ares Jr for generous support, and the neuro-team of ISIS Pharmaceuticals for critical comments and suggestions on this project. M.P. is the recipient of a Human Frontier Science Program Long Term Fellowship. C.L.-T. is the recipient of the Milton-Safenowitz post-doctoral fellowship from the Amyotrophic Lateral Sclerosis Association. D.W.C. receives salary support from the Ludwig Institute for Cancer Research. S.C.H. is funded by a National Science Foundation Graduate Research Fellowship. This work was supported by grants from the US National Institutes of Health (R37 NS27036 and an American Recovery and Reinvestment Act Challenge grant) to D.W.C and partially by grants from the US National Institutes of Health (HG004659 and GM084317) and the Stem Cell Program at the University of California, San Diego to G.W.Y. AUTHOR CONTRIBUTIONS M.P., C.L.-T., J.M. and T.Y.L. performed the experiments. K.R.H., S.C.H. and T.Y.L. conducted the bioinformatics analysis. S.-C.L. developed the monoclonal TDP-43– specific antibody used for CLIP-seq and the tetracycline-inducible GFP–TDP-43– expressing HeLa cells. S.-C.L. and E.S. generated the transgenic myc–TDP-43 mice. J.P.D. and L.S. conducted the preliminary splice-junction microarray analyses. M.P., C.L.-T., E.W., C.M., Y.S., C.F.B. and H.K. conducted the antisense oligonucleotide experiments. M.P., C.L.-T., K.R.H., G.W.Y. and D.W.C. designed the experiments. M.P., C.L.-T., K.R.H., S.C.H., G.W.Y. and D.W.C. wrote the paper. COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests. Published online at http://www.nature.com/natureneuroscience/. Reprints and permissions information is available online at http://www.nature.com/ reprintsandpermissions/. 1. Neumann, M. et al. Ubiquitinated TDP-43 in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Science 314, 130–133 (2006). 2. Arai, T. et al. TDP-43 is a component of ubiquitin-positive tau-negative inclusions in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Biochem. Biophys. Res. Commun. 351, 602–611 (2006). 3. Lagier-Tourenne, C., Polymenidou, M. & Cleveland, D.W. TDP-43 and FUS/TLS: emerging roles in RNA processing and neurodegeneration. Hum. Mol. Genet. 19, R46–R64 (2010). 4. Gitcho, M.A. et al. TDP-43 A315T mutation in familial motor neuron disease. Ann. Neurol. 63, 535–538 (2008). 5. Kabashi, E. et al. TARDBP mutations in individuals with sporadic and familial amyotrophic lateral sclerosis. Nat. Genet. 40, 572–574 (2008). 6. Sreedharan, J. et al. TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis. Science 319, 1668–1672 (2008). 7. Van Deerlin, V.M. et al. TARDBP mutations in amyotrophic lateral sclerosis with TDP-43 neuropathology: a genetic and histopathological analysis. Lancet Neurol. 7, 409–416 (2008). 8. Buratti, E. et al. Nuclear factor TDP-43 can affect selected microRNA levels. FEBS J. 277, 2268–2281 (2010). 9. Cooper, T.A., Wan, L. & Dreyfuss, G. RNA and Disease. Cell 136, 777–793 (2009). 10.Kwiatkowski, T.J. Jr. et al. Mutations in the FUS/TLS gene on chromosome 16 cause familial amyotrophic lateral sclerosis. Science 323, 1205–1208 (2009). 9 © 2011 Nature America, Inc. All rights reserved. articles 11.Vance, C. et al. Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6. Science 323, 1208–1211 (2009). 12.Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008). 13.Ule, J. et al. CLIP identifies Nova-regulated RNA networks in the brain. Science 302, 1212–1215 (2003). 14.Licatalosi, D.D. et al. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464–469 (2008). 15.Yeo, G.W. et al. An RNA code for the FOX2 splicing regulator revealed by mapping RNA-protein interactions in stem cells. Nat. Struct. Mol. Biol. 16, 130–137 (2009). 16.Ling, S.C. et al. ALS-associated mutations in TDP-43 increase its stability and promote TDP-43 complexes with FUS/TLS. Proc. Natl. Acad. Sci. USA 107, 13318–13323 (2010). 17.Zisoulis, D.G. et al. Comprehensive discovery of endogenous Argonaute binding sites in Caenorhabditis elegans. Nat. Struct. Mol. Biol. 17, 173–179 (2010). 18.Sephton, C.F. et al. Identification of neuronal RNA targets of TDP-43-containing ribonucleoprotein complexes. J. Biol. Chem. 286, 1204–1215 (2011). 19.Mili, S. & Steitz, J.A. Evidence for reassociation of RNA-binding proteins after cell lysis: implications for the interpretation of immunoprecipitation analyses. RNA 10, 1692–1694 (2004). 20.Buratti, E. et al. Nuclear factor TDP-43 and SR proteins promote in vitro and in vivo CFTR exon 9 skipping. EMBO J. 20, 1774–1784 (2001). 21.Chi, S.W., Zang, J.B., Mele, A. & Darnell, R.B. Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature 460, 479–486 (2009). 22.Parkhomchuk, D. et al. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 37, e123 (2009). 23.Pang, K.C. et al. RNAdb—a comprehensive mammalian noncoding RNA database. Nucleic Acids Res. 33, D125–D130 (2005). 24.Wang, E.T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008). 25.Hu, F. et al. Sortilin-mediated endocytosis determines levels of the frontotemporal dementia protein, progranulin. Neuron 68, 654–667 (2010). 26.Carrasquillo, M.M. et al. Genome-wide screen identifies rs646776 near sortilin as a regulator of progranulin levels in human pasma. Am. J. Hum. Genet. 87, 890–897 (2010). 27.Du, H. et al. Aberrant alternative splicing and extracellular matrix gene expression in mouse models of myotonic dystrophy. Nat. Struct. Mol. Biol. 17, 187–193 (2010). 28.Ayala, Y.M. et al. TDP-43 regulates its mRNA levels through a negative feedback loop. EMBO J. 30, 277–288 (2011). 29.Le Hir, H., Moore, M.J. & Maquat, L.E. Pre-mRNA splicing alters mRNP composition: evidence for stable association of proteins at exon-exon junctions. Genes Dev. 14, 1098–1108 (2000). 30.Wollerton, M.C., Gooding, C., Wagner, E.J., Garcia-Blanco, M.A. & Smith, C.W. Autoregulation of polypyrimidine tract binding protein by alternative splicing leading to nonsense-mediated decay. Mol. Cell 13, 91–100 (2004). 10 31.Sureau, A., Gattoni, R., Dooghe, Y., Stevenin, J. & Soret, J. SC35 autoregulates its expression by promoting splicing events that destabilize its mRNAs. EMBO J. 20, 1785–1796 (2001). 32.Perlick, H.A., Medghalchi, S.M., Spencer, F.A., Kendzior, R.J. Jr. & Dietz, H.C. Mammalian orthologs of a yeast regulator of nonsense transcript stability. Proc. Natl. Acad. Sci. USA 93, 10928–10932 (1996). 33.Baker, M. et al. Mutations in progranulin cause tau-negative frontotemporal dementia linked to chromosome 17. Nature 442, 916–919 (2006). 34.Cruts, M. et al. Null mutations in progranulin cause ubiquitin-positive frontotemporal dementia linked to chromosome 17q21. Nature 442, 920–924 (2006). 35.Fiesel, F.C. et al. Knockdown of transactive response DNA-binding protein (TDP-43) downregulates histone deacetylase 6. EMBO J. 29, 209–221 (2010). 36.Strong, M.J. et al. TDP43 is a human low molecular weight neurofilament (hNFL) mRNA-binding protein. Mol. Cell. Neurosci. 35, 320–327 (2007). 37.Bergeron, C. et al. Neurofilament light and polyadenylated mRNA levels are decreased in amyotrophic lateral sclerosis motor neurons. J. Neuropathol. Exp. Neurol. 53, 221–230 (1994). 38.Hutton, M. et al. Association of missense and 5′-splice-site mutations in tau with the inherited dementia FTDP-17. Nature 393, 702–705 (1998). 39.MacDonald, M.E. et al. A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes. Cell 72, 971–983 (1993). 40.Schwab, C., Arai, T., Hasegawa, M., Yu, S. & McGeer, P.L. Colocalization of transactivation-responsive DNA-binding protein 43 and huntingtin in inclusions of Huntington disease. J. Neuropathol. Exp. Neurol. 67, 1159–1165 (2008). 41.Valdmanis, P.N. et al. A mutation that creates a pseudoexon in SOD1 causes familial ALS. Ann. Hum. Genet. 73, 652–657 (2009). 42.Birve, A. et al. A novel SOD1 splice site mutation associated with familial ALS revealed by SOD activity analysis. Hum. Mol. Genet. 19, 4201–4206 (2010). 43.Mercado, P.A., Ayala, Y.M., Romano, M., Buratti, E. & Baralle, F.E. Depletion of TDP 43 overrides the need for exonic and intronic splicing enhancers in the human apoA-II gene. Nucleic Acids Res. 33, 6000–6010 (2005). 44.Dreumont, N. et al. Antagonistic factors control the unproductive splicing of SC35 terminal intron. Nucleic Acids Res. 38, 1353–1366 (2009). 45.Lin, S., Coutinho-Mansfield, G., Wang, D., Pandit, S. & Fu, X.D. The splicing factor SC35 has an active role in transcriptional elongation. Nat. Struct. Mol. Biol. 15, 819–826 (2008). 46.Tollervey, J.R. et al. Characterizing the RNA targets and position-dependent splicing regulation by TDP-43. Nat. Neurosci. advance online publication, doi:10.1038/nn.2778 (27 February 2011). 47.Xu, Y.F. et al. Wild-type human TDP-43 expression causes TDP-43 phosphorylation, mitochondrial aggregation, motor deficits, and early mortality in transgenic mice. J. Neurosci. 30, 10851–10859 (2010). 48.Igaz, L.M. et al. Dysregulation of the ALS-associated gene TDP-43 leads to neuronal death and degeneration in mice. J. Clin. Invest. 121, 726–738 (2011). advance online publication nature neuroscience ONLINE METHODS CLIP-seq library preparation and sequencing. Brains from 8-week-old female C57Bl/6 mice were rapidly dissociated by forcing through a cell strainer with a pore size of 100 µm (BD Falcon) before ultraviolet crosslinking. CLIP-seq libraries were constructed as previously described15, using a custom-made mouse monoclonal antibody to TDP-43 (ref. 16, 40 µl of ascites per 400 µl of beads per sample). Libraries were subjected to standard Illumina GA2 sequencing protocol for 36 cycles. © 2011 Nature America, Inc. All rights reserved. Generation of transgenic mice. All animal procedures were conducted in accordance with the guidelines of the Institutional Animal Care and Use Committee of the University of California. cDNA containing N-terminal myc-tagged fulllength human TDP-43 were amplified and digested by SalI and cloned into the XhoI cloning site of the MoPrP.XhoI vector (ATCC #JHU-2). The resultant MoPrP.XhoI-myc-hTDP-43 construct was then digested upstream of the minimal Prnp promoter and downstream of the Prnp exon 3 using BamHI and NotI and cloned into a shuttle vector containing loxP flanking sites. The final construct was then linearized using XhoI, injected into the pro-nuclei of fertilized C56Bl6/C3H hybrid eggs and implanted into pseudopregnant female mice. Stereotactic injections of antisense oligonucleotides. We anesthetized 8–10-week-old female C57Bl/6 mice with isofluorane. Using stereotaxic guides, 3 µl of ASO solution, corresponding to a total of 75 µg or 100 µg ASOs, or saline buffer was injected using a Hamilton syringe directly into the striatum. Mice were monitored for any adverse effects for the next 2 weeks until they were killed. The striatum and adjacent cortex area were dissected and frozen at –80 °C in 1 ml Trizol (Invitrogen). Trizol extraction of RNA and protein was performed according to the manufacturer’s instructions. RNA quality and RNA-seq library preparation. RNA quality was measured using the Agilent Bioanalyzer system according to the manufacturer’s recommendations. RNA-seq libraries were constructed as described previously22. We used 8 pM of amplified libraries for sequencing on the Illumina GA2 for 72 cycles. RT and qRT-PCR. cDNA of total RNA extracted from striatum was generated using oligodT and Superscript III reverse transcriptase (Invitrogen) according to the manufacturer’s instructions. To test candidate splicing targets, RT-PCR amplification using between 24 and 27 cycles were performed from at least three mice treated with a control ASO and three mice with treated with a TDP-43 ASO. Products were separated on 10% polyacrylamide gels followed by staining with SYBR gold (Invitrogen). Quantification of the different isoforms was performed with ImageJ software (US National Institutes of Health). Intensity ratio between products with included and excluded exons were averaged for three biological replicates per group. qRT-PCR for mouse Tardbp and Fus/Tls were performed using the Express One-Step SuperScript qRT-PCR kits (Invitrogen) and the thermocycler ABI Prism 7700 (Applied Biosystems). cDNA synthesis and amplification were performed according to the manufacturer’s instruction using specific primers and 5′ FAM, 3′ TAMRA-labeled probes. The cyclophilin gene was used to normalize the expression values. qRT-PCR for all other genes tested were performed with 3–5 mice for each group (treated with saline, control ASO or ASO to TDP-43) and two technical replicates using the iQ SYBR green supermix (Bio-Rad) on the IQ5 multicolor real-time PCR detection system (Bio-Rad). The analysis was done using the IQ5 optical system software (Bio-Rad; version 2.1). Expression values were normalized to at least two of the following control genes: Actb, Actg1 and Rsp9. Expression values were expressed as a percentage of the average expression of the saline treated samples. Inter-group differences were assessed by two-tailed Student’s t test. Primers for RT-PCR and qRT-PCR were designed using Primer3 software (http://frodo.wi.mit.edu/primer3/) and sequences are available on request. Immunoblots. Proteins were separated on custom-made 12% SDS page gels and transferred to nitrocellulose membrane (Whatman) following standard protocols. Membranes were blocked overnight in Tris-buffered saline Tween-20 (TBST) and 5% non-fat dry milk (wt/vol) at 4 °C, incubated for 1 h at 20 °C with primary antibodies and then again with horseradish peroxidise–linked secondary antibodies to rabbit or mouse (GE Healthcare) in TBST with 1% milk. For primary antibodies, we used rabbit antibody to FUS/TLS (Bethyl Laboratories, nature neuroscience cat #A300-302A; 1:5,000), rabbit antibody to TDP-43 (Proteintech, cat #10782; 1:2,000), rabbit antibody to TDP-43 (Aviva System Biology, cat #ARP38942_T100; 1:2,000), custom-made mouse antibody to TDP-43 (1:1,000)16, custom-made rabbit antibody to RFP raised against the full-length protein (1:7,000), mouse DM1α antibody to tubulin (1:10,000) and mouse antibody to GAPDH (Abcam, cat #AB8245; 1:10,000). Cells, cloning and luciferase assays. HeLa Flp-In cells expressing GFP-myc– TDP-43–HA were generated as previously described16. Isogenic cell lines were grown at 37 °C and 5% CO2 in Dulbecco’s modified Eagle medium (DMEM), supplemented with 10% tetracycline-free fetal bovine serum (vol/vol) and penicillin/streptomycin. Expression of GFP-myc–TDP-43–HA was induced with 1 mg ml–1 tetracycline for 24–48 h. To assess the mechanism of TDP-43 autoregulation, we cloned the proximal part of mouse TDP-43 3′ UTR into the psiCHECK-2 vector (Promega), which contains a Renilla luciferase and a Firefly luciferase reporter expression cassettes. The following primers were used to amplify 1.7 kb of TDP-43 3′ UTR using cDNA from mouse brain: 5′-aaa CTC GAG cag gct ttt ggt tct gga aa-3′ and 5′-aaa gcg gcc gcA CCA TTT TAG GTG CGG TCA C-3′. We obtained two products of 1.7 kb and 1.1 kb, corresponding, respectively, to an unspliced and a spliced isoform of TDP-43 3′ UTR (Supplementary Fig. 13a). Both products were purified on 1% agarose gels and cloned independently in the psiCHECK-2 vector using NotI and XhoI restriction sites located 3′ to the Renilla luciferase translational stop. Given that binding sites for TDP-43 mainly lie in the alternative intron, the spliced isoform was used as a control to assess the effect of TDP-43 protein on its own RNA. Human myc-TDP-43-HA cDNA (a generous gift from C. Shaw, King’s College London) was cloned into mammalian expression vector, pCl-neo (Promega). RFP was cloned in the vector pcDNA3 (Invitrogen). 250 ng of psiCHECK-2 vector with or without TDP-43 3′ UTR and 250 ng of the vector expressing TDP-43 or RFP were co-transfected in Hela cells using Fugene 6 transfection reagent (Roche) in 12-wells cell culture plates. Luciferase assays were performed 48 h after transfection using the Dual-Luciferase Reporter 1000 assay system (Promega) according to the manufacturer’s instructions. Five independent experiments were performed and 20 µl of lysate were used in duplicate for each condition. RFU for Renilla luciferase were normalized to firefly luciferase to control for transfection efficiency. Duplicates were averaged and each condition was expressed as a percentage of the samples without transfection of TDP-43 or RFP cDNA. Inter-group differences were assessed by two-tailed Student’s t test. Mouse gene structure annotations. The mouse genome sequence (mm8) and annotations for protein-coding genes were obtained from the UCSC Genome Browser. Known mouse genes (knownGene containing 31,863 entries) and known isoforms (knownIsoforms containing 31,014 entries in 19,199 unique isoform clusters) with annotated exon alignments to the mouse genomic sequence were processed as follows. Known genes that were mapped to different isoform clusters were discarded. All mRNAs aligned to mm8 that were greater than 300 nucleotides were clustered together with the known isoforms. For the purpose of inferring alternative splicing, genes containing fewer than three exons were removed from further consideration. A total of 1.9 million spliced ESTs were mapped onto the 16,953 high-quality gene clusters to identify alternative splicing events. Final annotated gene regions were clustered together so that any overlapping portion of these databases was defined by a single genomic position. Promoter regions were arbitrarily defined as 1.5 kb upstream of the transcriptional start site of the gene and intergenic regions as unannotated regions in the genome. To identify 5′ and 3′ UTRs, we relied on the coding annotation in UCSC Genome Browser known genes that we extended 1.5 kb downstream or upstream the start and stop codons, respectively. Comparison to human alternatively spliced exons. To find evidence for conserved alternative splicing patterns in humans, we used the UCSC LiftOver tool to obtain the corresponding human coordinates of the mouse exons that had evidence for differential splicing either from RNA-seq or splicing-sensitive microarray data. These human genome coordinates were then compared with human gene structure annotations constructed analogously to mouse annotations as described above to determine whether the exon in the human ortholog was alternatively or constitutively spliced based on transcript evidence in human EST or mRNA databases49. doi:10.1038/nn.2779 © 2011 Nature America, Inc. All rights reserved. Computational identification of CLIP-seq clusters. CLIP-seq reads were trimmed to remove adaptor sequences and homopolymeric runs, and mapped to the repeat-masked mouse genome (mm8) using the bowtie short-read alignment software (version 0.12.2) with parameters -q -p 4 -e 100 -y -a -m 10 --best --strata (-q for fastq file, -p for cpus, -e for mismatch threshold, -y to do a sensitive search, -a to report all alignments, --best for ordered output, --strata report only the best group of alignments), incorporating the base-calling quality of the reads. To eliminate redundancies created by PCR amplification, we considered all reads with identical sequence to be a single read. Significant clusters were calculated by first determining read number cutoffs using the Poisson distribution, λ k e−λ f ( k ; λ) = , where λ is the frequency of reads mapped over an interval of k! nucleotide sequence, k is the number of reads being analyzed for significance and f(k;λ) return the probability that exactly k reads would be found. For any desired P value, P cutoff, a read number cutoff was calculated by summing the probabilities for finding k or more tags, and determining the minimum value of k that satisfies i = k such that f(k;λ) > P cutoff. The frequency λ was calculated by dividing the total number of mapped reads by the number of non-overlapping intervals present in the transcriptome. The interval size was chosen on the basis of the average size of the CLIP product, which includes only the selected RNA fragments, but not any ligated adapters (150 bp). A global and a local cutoff were determined using the whole transcriptome frequency or gene-specific frequency, respectively. The gene-specific frequency was the number of reads overlapping that gene divided by the pre-mRNA length. A sliding window of 150 bp was used to determine where the read numbers exceed both the global and local cutoffs. At each significant interval, we attempted to extend the region by adding in the next read, ensuring continued significance at the same P cutoff. Log curve analysis to determine target saturation of CLIP-derived clusters. We used a curve-fitting approach with various sampling rates to estimate the number of CLIP-seq reads required to discover additional CLIP clusters. The target rate was calculated by determining the new set of CLIP-derived gene targets found at each step-wise increase in the number of sequenced reads. A scatter plot of targets found versus sampling rate was fitted with a log curve, which was then used to extrapolate the number of targets expected to be found by increasing read counts. Splicing-sensitive microarray data analysis. Microarray data analysis was performed as previously described, selecting significant, high-quality events using a q value > 0.05 and an absolute separation score (sepscore) > 0.5 (ref. 27). The equation for sepscore = log2[TDP-43 depleted (Skip/Include)/ Control treated (Skip/Include)]. For each replicate set, the log2 ratio of skipping to inclusion was estimated using robust least-squares analysis. Previously published work using similar cutoffs have validated about 85% of splicing events by RT-PCR27. Events on the array were defined in the mm9 genome annotation, and for proper comparison to RNA-seq and CLIP-seq data, cassette events were converted to the mm8 genome annotation using the UCSC LiftOver tool. If an event did not exactly overlap an mm8 annotated exon, it was left out of further analyses. Genomic analysis of CLIP clusters. Two biologically independent TDP-43 CLIPseq libraries were generated and sequenced on the Illumina GA2. Subjecting reads from each library to our cluster-identification pipeline described above defined 15,344 and 30,744 clusters for CLIP experiments 1 and 2, respectively. A gene was considered to be a TDP-43 target that overlapped in both experiments if it contained an overlapping cluster. We generated ten randomly distributed cluster sets and compared each to the original clusters. To compute the significance of the overlap, we calculated the standard or Z-score as follows: (percent overlap in the two experiments – mean percent overlap in ten randomly distributed cluster sets)/ s.d. of percent overlap in the randomly distributed cluster sets. A P value was computed from the standard normal distribution and assigned significance if it was lower than P < 0.01. To generate the final set of TDP-43 CLIP-seq clusters, unique reads from both experiments were combined and subjected to our clusteridentification pipeline. Overall, clusters were at most 300 bases in length, with the median of 142 bases. As a comparison to a published HITS-CLIP/CLIP-seq dataset of a RNA binding protein in mouse brain, we downloaded Ago-HITS-CLIP reads (Brain[A–E]_130_50_fastq.txt) from http://ago.rockefeller.edu/rawdata. php and subjected the combined 1,651,104 reads to our cluster-identification and generated 33,390 clusters in 7,745 genes21. To identify enriched motifs in cluster regions, Z scores were computed for hexamers as previously published15. doi:10.1038/nn.2779 Functional annotation analysis. We used the Database for Annotation, Visualization and Integrated Discovery (DAVID Bioinformatic Resources 6.7; http://david.abcc.ncifcrf.gov/). For all genes downregulated or upregulated on TDP-43 depletion, a background corresponding to all genes expressed in brain was used. Transcriptome and splicing analysis. Strand-specific RNA-seq reads each from control oligonucleotide, saline- and TDP-43 oligonucleotide–treated animals were generated and ~50% mapped uniquely to our annotated gene structure database using the bowtie short-read alignment software (version 0.12.2, with parameters -m 5 -k 5 --best --un --max -f) incorporating the base-calling quality of the reads. To eliminate redundancies created by PCR amplification, all reads with identical sequence were considered single reads. The expression value of each gene was computed by RPKM. Each RNA-seq sample was summarized by a vector of RPKM values for every gene and pair-wise correlation coefficients were calculated for all replicates using a linear least-squares regression against the log RPKM vectors. Hierarchical clustering revealed that the three treated conditions clustered into similar groups. The reads in each condition were combined to identify genes that were significantly up- and downregulated on TDP-43 depletion. Local mean and s.d. were calculated for the nearest 1,000 genes, as determined by log average RPKM values between knockdown and control and a local Z score was defined. The resulting Z scores were used to assign significantly changed genes (Z > 2 upregulated, Z < –2 downregulated), as well as ranking the entire gene list for relative expression changes. Specific parameters, such as intron length or TDP-43–binding sites, were plotted for the next 100 genes. Exons with canonical splice signals (GT-AG, AT-AC, GC-AG) were retained, resulting in a total of 190,161 exons. For each protein-coding gene, the 50 bases at the 3′ end of each exon were concatenated with the 50 bases at the 5′ end of the downstream exon producing 1,827,013 splice junctions. An equal number of ‘impossible’ junctions was generated by joining the 50-base exon junction sequences in reverse order. To identify differentially regulated alternative cassette exons, we employed a modification of a published method24. In short, the read count supporting inclusion of the exon (overlapping the cassette exon and splice junctions including the exon) were compared with the read count supporting exclusion of the exon (overlapping the splice junction of the upstream and downstream exon. For a splice junction read to be enumerated, we required that at least 18 nucleotides of the read aligned and 5 bases of the read extended over the splice junction with no mismatches 5 bases around the splice junction. For the TDP-43 and control ASOs comparison, we constructed a 2 × 2 contingency table using of the counts of the reads supporting the inclusion and exclusion of the exon, in two conditions. Every cell in the 2 × 2 table had to contain at least five counts for a χ2 statistic to be computed. At P < 0.01, 110 excluded and 93 included single cassette exons were detected to be differentially regulated by TDP-43. As an estimate of false discovery, we observed that ~20 single cassette exons were detected by using the impossible junction database. Pre-mRNA features and tissue-specificity analysis. Affymetrix microarray data representing 61 mouse tissues were downloaded from the Gene Expression Omibus repository (http://www.ncbi.nih.gov/geo) under accession number GSE1133 (ref. 50). Probes on the microarrays were cross-referenced to 15,541 genes in our database using files downloaded from the UCSC Genome Browser (knownToGnf1m and knownToGnfAtlas2). The expression value for each gene was represented by the average value of the two replicate microarray experiments for each tissue. To identify genes enriched in brain, we grouped 13 tissues as ‘brain’ (substantia nigra, hypothalamus, preoptic, frontal cortex, cerebral cortex, amygdala, dorsal striatum, hippocampus, olfactory bulb, cerebellum, trigeminal, dorsal root ganglia, pituitary) and the remaining 44 tissues as ‘non-brain’, excluding the five embryonic tissues (embryonic days 10.5, 9.5, 8.5, 7.5 and 6.5). − µnon-brain ) (µ , where For each gene, the t statistic was computed as t = brain 2 2 s brain + s non-brain mbrain(sbrain) and mnon-brain(snon-brain) were the average (s.d.) of the gene expression values in brain and non-brain tissues, respectively. At a t statistic value cutoff of ≥1.5, 388 genes were categorized as being brain enriched. Concurrently, at a cutoff of <1.5, 15,153 genes were categorized as being non-brain enriched. Random sets of 388 genes were selected from the 15,541 genes as controls for the brain-enriched set. To determine whether pre-mRNA features were significantly different in the set of brain-enriched genes (or randomly chosen genes) nature neuroscience ‘brain’ (temporal lobe, globus pallidus, cerebellum peduncles, cerebellum, caudate nucleus, whole brain, parietal lobe, medulla oblongata, amygdala, prefrontal cortex, occipital lobe, hypothalamus, thalamus, subthalamic nucleus, cingulated cortex, pons, fetal brain), and the remaining 62 tissues as ‘non-brain’. At the same t statistic cutoffs, 387 and 17,985 genes were categorized as brain enriched and non-brain enriched, respectively. Random sets of 387 genes were selected from the 18,372 genes as controls for the brain-enriched set. 49.Yeo, G.W., Van Nostrand, E.L. & Liang, T.Y. Discovery and analysis of evolutionarily conserved intronic splicing regulatory elements. PLoS Genet. 3, e85 (2007). 50.Su, A.I. et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl. Acad. Sci. USA 101, 6062–6067 (2004). © 2011 Nature America, Inc. All rights reserved. compared with non-brain–enriched genes, we performed the two-sample Kolmogorov-Smirnov goodness-of-fit hypothesis test, which determines whether the distribution of features were drawn from the same underlying continuous population. Cumulative distribution plots of pre-mRNA features were generated to illustrate the differences. Affymetrix microarray data representing 79 human tissues were downloaded from the Gene Expression Omibus repository under the same accession number GSE1133 (ref. 50). Probes on the human microarrays were cross-referenced to 18,372 genes in our database using files downloaded from the UCSC Genome Browser (knownToGnfAtlas2 – hg18). The same analysis as done for the mouse array data was repeated for these human array data. We grouped 17 tissues as nature neuroscience doi:10.1038/nn.2779 PolymenidouetalSUPPLEMENTARYINFORMATION Long pre‐mRNA depletion and RNA missplicing contribute to neuronal vulnerabilityfromlossofTDP‐43 Polymenidou M*, Lagier‐Tourenne C*, Hutt KR*, Huelga SC, Moran J, Liang TY, Ling SC, Sun E, Wancewicz E, MazurC,KordasiewiczH,SedaghatY,DonohueJP,ShiueL,BennettCF,YeoGWandClevelandDW Supplementary Figure 1 TDP‐43 CLIP‐seq method, reproducibility and saturation of binding sites. (a) Schematic representation of cross‐linking, immunoprecipitation and high‐throughput sequencing protocol (CLIP‐seq). (b) Autoradiographs of CLIP‐seq experiments performed in parallel using two different antibodies, a commercially available rabbit polyclonal TDP‐43 antibody (Aviva Systems Biology, Cat. No. ARP38942_T100) left and an in‐house monoclonal antibody recognizing an epitope within aminoacids251‐414ofhuman TDP‐4316. Under identical conditionsandexposuretimes, our monoclonal antibody showedamuchhigherpotency of immunoprecipitating TDP‐ 43‐RNAcomplexes.Theorange box depicts the excised low mobility complexes used for the cDNA libraries for comparison to the monomeric TDP‐43 complexes. (c) Reproducibility of CLIP‐seq experimentsonagenome‐wide scale. Clusters identified in independent CLIP‐seq experiments were considered tooverlapif1baseofacluster in one experiment extended to a cluster in the other experiment(leftpanel).Agene containing an overlapping cluster was considered to overlap (right panel). Venn diagrams revealed statistically significant overlap between clusters and genes in replicate CLIP‐seq libraries, with randomly distributed clusters shown as a comparison (lower panel). (d) Gene target saturation chart plotting gene targets found by cluster‐finding on increasingsamplingrates(5%intervals).DatafollowsalogarithmicexpansionwithanR2valueof0.98. natureneuroscience Nature Neuroscience: doi:10.1038/nn.2779 1 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 2 Cluster strength and gene-location biases of TDP-43 binding. (a) Box plots of cluster strength (by χ² analysis) versus GUGU-cluster bins (counts of non-overlapping GUGU tetramer found within the cluster). By t-test comparison to all clusters, bins of 11-15 and 16+ GUGUs were significantly enriched (p < 0.01 and p < 10-37, respectively). On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are plotted individually (b) Pre-mRNAs were divided into annotated 5′ and 3′ untranslated regions, exons, proximal introns (<0.5kb from a splice junction), distal introns (>0.5kb from a splice junction) and unannotated regions. Pie-charts revealed that the majority of TDP-43 binding sites were located in distal intronic regions of pre-mRNA (left panel). For comparison, previously published Argonaute (Ago) binding sites in mouse brain21 are displayed as analogous pie-charts (right panel). (c) CLIP binding sites of TDP-43 in the two replicates, Ago, and random TDP-43 clusters across genes. The fraction of CLIP clusters was plotted depending on the relative gene position from the 5’ to the 3’ ends of each target gene. Supplementary Figure 3 RNA-seq biological replicas. Correlation between RNA-seq results obtained from mice subjected to different ASO treatments. Heatmap generated from Cluster3 and Matlab, using linear least-squares regression correlation coefficients of the pair-wise comparison between RPKM values of all genes from each experiment. “K” is TDP-43 knockdown samples, “C” is control oligo-, “S” is salinetreated samples, and numbers represent biological replicates. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 2 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 4 Correlation between RNA-seq and CLIP-seq data using various parameters. (a) Genes were ranked randomly instead of upon their degree of regulation after TDP-43 depletion and the average TDP-43 CLIP clusters found in introns of the next 100 genes in the ranked list were plotted (green line). Similarly, the median of total intron length for the next 100 genes was plotted (blue line). (b) Genes were ranked by ascending expression (RPKM) values in the samples treated with control ASO. (c) Genes were ranked by ascending expression (RPKM) values in the samples treated with TDP-43 ASO. (d) Genes were ranked upon their degree of regulation after TDP-43 depletion and average Ago clusters found in introns of the next 100 genes (green line) or the total intron length for the next 100 genes (blue line) were plotted. (e) Genes were ranked upon their degree of regulation after TDP-43 depletion and average clusters found in 5’UTR for the next 100 genes (green line) or the total 5’UTR length for the next 100 genes (blue line) were plotted. (f) Genes were ranked upon their degree of regulation after TDP-43 depletion and average clusters found in 3’UTR for the next 100 genes (green line) or the total 3’UTR length for the next 100 genes (blue line) were plotted. (g) Genes were ranked upon their degree of regulation after TDP-43 depletion and average clusters found in exons for the next 100 genes (green line) or the total exon length for the next 100 genes (blue line) were plotted. No correlation between CLIP and RNAseq data was observed (a-f). (h) Genes were ranked upon their degree of regulation after TDP-43 depletion and average clusters found in introns for the next 100 genes (green line) or the average intron length for the next 100 genes (blue line) were plotted. (i) Genes were ranked upon their degree of regulation after TDP-43 depletion and average cluster density (cluster count/intron length) for the next 100 genes was plotted. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 3 PolymenidouetalSUPPLEMENTARYINFORMATION Supplementary Figure 5 The calcium signaling pathway is affected by TDP‐43 depletion. (a) The calcium‐signaling pathwayannotatedbyKEGG(KyotoEncyclopediaofGenesandGenomes)wasfoundtobesignificantusingthefunctional annotationtoolDAVID.Red‐borderedgenegroupscontainTDP‐43targetsthataredownregulateduponTDP‐43depletion. (b)ListofdownregulatedTDP‐43targetscontributingtocalcium‐signalingpathway Nature Neuroscience: doi:10.1038/nn.2779 natureneuroscience 4 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 6 The long-term potentiation pathway is affected by TDP-43 depletion. (a) The long-term potentiation pathway annotated by KEGG was found to be significant using the functional annotation tool DAVID. Redbordered gene groups contain TDP-43 targets that are downregulated upon TDP-43 depletion. (b) List of downregulated TDP-43 targets contributing to long-term potentiation pathway. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 5 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 7 Representative examples of genes with long introns and multiple TDP-43 binding sites. (a) Most transcripts in this group (mean intron length >10kb) displayed multiple intronic TDP-43 binding sites as determined by CLIP-seq (purple bars above each gene). In contrast, Chat is an example of a downregulated gene with short intron length (mean intron length <10kb) that shows a single TDP-43 binding site. The number of clusters per gene (n) is in purple on the left. Note that each cluster represents a binding cluster defined by a collection of overlapping reads as shown for Park2 (inset). (b) Examples of conservation of gene structures in human neurexin 3 and parkin 2 showing similar genic architectures as their mouse orthologues. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 6 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 8 TDP-43 binding on the 3′UTR of RNAs represses gene expression. (a) Normalized expression (based on RPKM values from RNA-seq) of selected up-regulated genes upon TDP-43 depletion that also contain TDP-43 binding regions in their 3’UTR. (b) Enrichment for binding within the 3’UTR among upregulated TDP-43 targets, compared to unchanged genes. *Upregulated with intronic binding versus unchanged with intronic binding (p<1.8x10-6), **Upregulated with 3’UTR binding versus unchanged with 3’UTR binding (p<8.2x10-3), ***Upregulated with 3’UTR binding versus downregulated with 3’UTR binding (p < 1.4 x 10-2). P-values calculated by chi-square. Supplementary Figure 9 Splicing changes upon TDP-43 depletion identified by exonjunction arrays. Distribution of alternative splicing events upon TDP-43 depletion detected using splicing-sensitive microarrays. Of the 326 cassette events defined by Affymetrix on mm9 annotation, 287 were annotated by our stringent gene structure annotations in mm8, and were used for comparisons to RNA-seq and CLIPseq data. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 7 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 10 Examples of alternative splicing regulation by TDP-43. RNA targets with exons included (a) or excluded (b) upon TDP-43 knockdown (same as in Fig. 4d). Semi-quantitative RT-PCR of selected targets showing splice changes in samples with TDP-43 knockdown compared to controls. Schematic representation of assessed exons with the changed exon(s) colored orange (left). The exon number of the gene is indicated below each box. The red arrows depict the position of the primers used for RT-PCR. Purple boxes indicate TDP-43 binding sites as defined by CLIP-seq. Representative acrylamide gel pictures of RT-PCR products from control or knockdown adult brain samples, as indicated (right panel). Quantification of splicing changes from three biological replicas per group is shown in the middle panel. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 8 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 11 Splicing changes correlate with the level of TDP-43 knockdown. (a) Acrylamide gels with semiquantitative RT-PCR samples showing sortilin 1 (Sort1) splicing changes in mice with various levels of TDP-43 reduction (shown as percentages above each bar). (b) Significant correlation of TDP-43 levels to Sort1 splicing changes (correlation value = -0.905, n=27). (c) Acrylamide gels with semi-quantitative RT-PCR samples showing Kcnip splicing changes in mice with various levels of TDP-43 reduction (shown as percentages above each bar on the lowest panel). (D-E) Significant correlation of TDP-43 levels to Kcnip splicing changes (correlation value = 0.91 for exon 3 and 0.83 for exon 2 and 3, n=12). nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 9 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 12 TDP-43 3’UTRs used in luciferase assays and control immunoblots. (a) Two alternative 3’UTRs of TDP-43 were amplified from mouse or human cDNA using primers whose approximate locations are shown by the small red arrows below the schemes on the right. The first 1.7kb or 1.1kb of each long (unspliced) or short (spliced) 3’UTR respectively were then cloned into a renilla luciferase reporter vector. The latter also contains the firefly luciferase gene that is expressed independently from renilla luciferase and serves as a transfection normalization control. (b) Representative immunoblots of HeLa cell lysates used for luciferase assays in Fig. 5F showing increased expression of TDP-43 in samples transfected with a myc-hTDP-43-HA expressing vector. For control, cells were transfected with a vector expressing an unrelated protein, namely red fluorescence protein (RFP), whose levels were confirmed by immunoblot (second panel). (c) Representative immunoblots of HeLa cell lysates used for Fig. 5F,G showing decreased levels of UPF1 in samples transfected with UPF1 siRNA. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 10 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 13 TDP-43 binding and regulation of Hdac6 and Nefl transcripts. (a) CLIP-seq reads and clusters on Hdac6 transcript showing intronic TDP-43 binding. RNA-seq reads from control or TDP-43 knockdown samples (equal scales) show a slight decrease in Hdac6 mRNA in the TDP-43 knockdown group. (b) Quantitative RT-PCR confirmed the slight decrease in Hdac6 mRNA upon TDP-43 depletion. Standard deviation was calculated within each group for 3 biological replicas. (c) CLIP-seq reads and clusters on Nefl mRNA showing two distinct TDP-43 binding sites within the 3’UTR. Our RNA-seq reads indicate that the 3’UTR of mouse Nefl is longer than reported in UCSC database (light blue bar indicates the extention). RNA-seq reads from control or TDP-43 knockdown samples show a slight decrease in Nefl mRNA in the TDP-43 knockdown group, in accordance with previous report. nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 11 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Figure 14 TDP-43 binding patterns on transcripts encoding for selected disease-related proteins. The number of clusters per gene (n) is in purple on the left. Note that each cluster represents a binding site defined by a collection of overlapping reads as shown for ataxin 1 (inset). nature neuroscience Nature Neuroscience: doi:10.1038/nn.2779 12 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Table 1 Genes upregulated upon TDP-43 depletion in adult mouse brain Number of CLIP-seq clusters RNA-seq Total in 5'UTR in introns Oasl2 0 0 0 0 0 29.97 2.86 10.49 11.66 Ifit1 0 0 0 0 0 23.26 2.11 11.03 11.23 Gene symbol Protein Refseq/mRNA identifier BC034361 Oasl2 protein Interferon-induced protein with NM_008331 tetratricopeptide repeats 1 (IFIT-1) Interferon-induced protein with NM_010501 Ifit3 tetratricopeptide repeats 3 (IFIT-3) NM 198095 Bst2 DAMP-1 protein 5830458K16Rik Receptor transporting protein 4 NM 023386 Signal transducer and activator of AK046517 Stat1 transcription 1 Glycoprotein (transmembrane) nmb AK079220 Gpnmb Lectin, galactoside-binding, soluble, 3 NM_011150 Lgals3bp binding Interferon induced transmembrane protein Ifitm3 NM_025378 3 Transgelin 2 BC049861 Tagln2 Interferon-induced protein with AK077243 Ifit3 tetratricopeptide repeats 3 (IFIT-3) Interferon inducible protein 1 AK002545 Irgm Vim Vimentin NM 011701 C3 Complement component 3 NM 009778 Gbp2 Guanylate nucleotide binding protein 2 NM 010260 Cd5l CD5 antigen-like NM 009690 Cd44 CD44 antigen isoform b NM 001039150 Serine (or cysteine) proteinase inhibitor, Serping1 NM_009776 clade NM 009779 C3ar1 Complement component 3a receptor 1 in RPKM in in 3'UTR exons TDP-43 KD RPKM in Ratio Z score control KD/control 0 0 0 0 0 20.03 3.07 6.53 8.15 0 0 0 0 0 0 0 0 0 0 16.05 10.03 3.21 1.44 5.00 6.98 6.96 6.86 0 0 0 0 0 20.59 4.52 4.56 6.81 0 0 0 0 0 70.72 16.86 4.20 6.62 0 0 0 0 0 73.81 17.96 4.11 6.61 0 0 0 0 0 30.74 8.52 3.61 6.58 0 0 0 0 0 15.17 3.62 4.19 6.18 1 0 0 0 1 9.91 1.72 5.75 6.15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14.53 111.40 8.71 7.49 6.39 12.36 3.68 31.33 1.57 1.33 0.98 2.87 3.95 3.56 5.56 5.61 6.50 4.30 6.02 5.92 5.91 5.89 5.76 5.69 0 0 0 0 0 14.77 3.96 3.73 5.65 0 0 0 0 0 16.68 4.72 3.53 5.40 AF220014 0 0 0 0 0 8.96 1.91 4.69 5.39 NM 007807 AK171067 NM 175628 NM 029803 NM 008175 AK046736 NM 007631 NM 010705 NM 007669 NM 011909 NM 010259 NM 011163 BC003792 NM 009263 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 5.49 22.69 11.32 4.29 70.17 11.26 33.11 12.07 26.03 4.11 6.55 11.99 11.28 35.34 0.86 7.54 2.76 0.55 23.12 2.78 11.12 3.31 9.17 0.57 1.40 3.26 2.94 12.35 6.36 3.01 4.10 7.76 3.04 4.05 2.98 3.65 2.84 7.25 4.67 3.68 3.84 2.86 5.32 5.29 5.25 5.22 5.22 5.20 5.19 5.11 5.05 5.05 5.05 5.05 5.00 4.93 AK159572 0 0 0 0 0 44.59 16.67 2.67 4.53 Cd68 Cytochrome b-245, beta polypeptide CD84 antigen Alpha-2-macroglobulin Interferon stimulated gene 12(b1) Granulin Novel TRAF-type zinc finger protein Cyclin D1 Galectin 3 Cyclin-dependent kinase inhibitor 1A Ubiquitin specific protease 18 Guanylate nucleotide binding protein 1 Protein kinase, interferon-inducible double Apolipoprotein B mRNA editing enzyme Secreted phosphoprotein 1 TYRO protein tyrosine kinase binding protein Lectin, galactose binding, soluble 9 Schlafen 8 Cystatin F (leukocystatin) Insulin-like growth factor I precursor (IGF-I) (Somatomedin) Transglutaminase 1, K polypeptide Fc receptor, IgE, high affinity I, gamma DEAD (Asp-Glu-Ala-Asp) box polypeptide 58 Interferon dependent positive acting Complement component 1, q subcomponent, A chain H2-K1 protein B aggressive lymphoma CD52 antigen H-2 class I histocompatibility antigen, D-B alpha chain precursor (H- 2D(B)) Macrosialin precursor (CD68 antigen) BC021637 0 0 0 0 0 38.54 14.73 2.62 4.51 S100a6 S100 calcium binding protein A6 (calcyclin) NM_011313 0 0 0 0 0 11.39 3.44 3.31 4.50 Trim30 Cybb Cd84 A2m Ifi27 Grn Fbxo39 Ccnd1 Lgals3 Cdkn1a Usp18 Gbp1 Eif2ak2 Apobec1 Spp1 Tyrobp Lgals9 Slfn8 Cst7 Igf1 Tgm1 Fcer1g Ddx58 Isgf3g C1qa H2-K1 Parp9 Cd52 H2-D1 Rsad2 B2m Fcgr2b Slfn2 Arpc1b Iigp2 Ly9 Sn Slfn9 Dtx3l NM 205820 Lyzs Cxcr4 Slfn5 Msn Lilrb4 Parp14 Cd48 Tripartite motif protein 30 (Down regulatory protein of interleukin 2 receptor) NM_011662 0 0 0 0 0 37.90 13.18 2.87 4.90 NM 010708 NM 181545 NM 009977 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16.47 5.31 12.31 5.34 0.97 3.60 3.09 5.45 3.42 4.90 4.84 4.83 AY878193 0 0 0 0 0 7.26 1.73 4.20 4.80 NM 019984 NM 010185 0 0 0 0 0 0 0 0 0 0 4.50 33.93 0.69 12.36 6.48 2.75 4.80 4.72 4.70 AK155930 0 0 0 0 0 6.74 1.66 4.07 NM 008394 0 0 0 0 0 15.30 5.00 3.06 4.68 NM_007572 0 0 0 0 0 137.39 50.81 2.70 4.66 BC080756 NM 030253 NM 013706 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 31.08 6.64 10.40 11.60 1.68 2.87 2.68 3.94 3.62 4.66 4.56 4.53 Viral hemorrhagic septicemia virus(VHSV) NM_021384 induced NM 009735 Beta-2-microglobulin Low affinity immunoglobulin gamma Fc NM_010187 region receptor II precursor NM 011408 Schlafen 2 Actin related protein 2/3 complex, subunit AK171364 1B Highly similar to Mus musculus interferonAK128991 gamma induced GTPase Ly9 protein BC095921 Sialoadhesin NM 011426 Schlafen 9 NM 172796 Weakly similar to B-lymphoma- and BALBC085093 associated protein (Rhysin 2) NM 205820 Toll-like receptor 13 Lysozyme C type M precursor AK159276 C-X-C chemokine receptor type 4 NM 009911 Schlafen 5 NM 183201 Moesin NM 010833 Leukocyte immunoglobulin-like receptor, NM 013532 Poly (ADP-ribose) polymerase family, NM_001039530 member 14 NM 007649 CD48 antigen Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 0 0 0 0 0 4.28 0.70 6.12 4.49 0 0 0 0 0 139.63 54.00 2.59 4.45 0 0 0 0 0 11.96 3.85 3.11 4.42 0 0 0 0 0 7.60 2.09 3.64 4.41 0 0 0 0 0 11.69 3.81 3.07 4.32 0 0 0 0 0 8.11 2.26 3.59 4.32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9.30 3.25 3.48 2.77 0.55 0.60 3.37 5.88 5.81 4.20 4.18 4.15 0 0 0 0 0 8.50 2.56 3.32 4.11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7.47 111.97 6.74 8.63 33.37 5.25 2.23 46.71 1.94 2.60 14.04 1.29 3.36 2.40 3.47 3.32 2.38 4.07 4.11 4.11 4.09 4.08 4.08 4.07 0 0 0 0 0 6.08 1.65 3.70 4.05 0 0 0 0 0 8.11 2.49 3.26 4.04 13 Polymenidou et al Edg3 SUPPLEMENTARY INFORMATION Ly86 AK029112 Ftl1 AK141670 Plscr2 Ctss Lcn2 Tlr2 Unc93b1 Cd274 Cd9 Apoc1 Hexa Igfbp5 Anxa2 Glipr1 Pdpn Psmb8 Endothelial differentiation, sphingolipid NM 010101 Lysosomal-associated protein NM_010686 transmembrane 5 NM 030707 Macrophage scavenger receptor 2 Cellular repressor of E1A-stimulated genes NM_011804 1 Epithelial membrane protein 1 NM 010128 Heme oxygenase (decycling) 1 NM 010442 Zinc finger protein 288 BC056446 Triggering receptor expressed on myeloid NM_031254 cells NM 007651 CD53 antigen Angiogenin, ribonuclease A family, member NM_007447 1 AK079888 MS4A7 protein homolog ATP-binding cassette sub-family A member NM_013454 1 NM 022325 Cathepsin Z preproprotein E030016B05 product:embryonic epithelial AK086962 gene 1 NM 199146 Hypothetical protein LOC209387 Transporter 1, ATP-binding cassette, subNM_013683 family AK152638 Cathepsin D Oncostatin M receptor beta AB015978 Polymeric immunoglobulin receptor 3 AF251703 Cytochrome b-245, alpha polypeptide NM 007806 TAP binding protein isoform 1 NM 001025313 Complement component 1, q NM_009777 subcomponent, B chain NM 008533 CD180 antigen Interferon-induced protein with NM 008332 Lectin, galactose binding, soluble 1 NM 008495 Glial fibrillary acidic protein AK140151 Melanoma differentiation associated NM_027835 protein-5 NM 144830 Hypothetical protein LOC217203 Chemokine (C-X-C motif) ligand 5 NM 009141 Solute carrier family 15, member 3 NM 023044 Complement component 1, q NM_007574 subcomponent, gamma Lymphocyte antigen 86 NM 010745 Cybrd1 protein AK029112 Ferritin light chain 1 NM 010240 Lysosomal acid lipase 1 AK141670 Phospholipid scramblase 2 NM 008880 Cathepsin S AK028366 Lipocalin 2 NM 008491 Toll-like receptor 2 NM 011905 Unc93 homolog B NM 019449 CD274 antigen NM 021893 CD9 antigen NM 007657 Apolipoprotein C-I NM 007469 Hexosaminidase A AK159091 Insulin-like growth factor binding protein 5 NM 010518 Anxa2 protein BC004659 GLI pathogenesis-related 1 (glioma) NM 028608 Glycoprotein 38 precursor (GP38) BC026551 Proteasome subunit beta type 8 precursor BC013785 Hspb6 Laptm5 Msr2 Creg1 Emp1 Hmox1 Zbtb20 Trem2 Cd53 Ang1 Ms4a7 Abca1 Ctsz Slc43a3 AI451617 Tap1 Ctsd Osmr Cd300lf Cyba Tapbp C1qb Cd180 Ifit2 Lgals1 Gfap Ifih1 Tmem106a Cxcl5 Slc15a3 0 0 0 0 0 10.24 3.29 3.11 0 0 0 0 0 53.57 23.32 2.30 4.03 4.03 0 0 0 0 0 60.70 25.61 2.37 4.00 0 0 0 0 0 25.09 11.02 2.28 3.99 0 0 27 0 0 0 0 0 27 0 0 0 0 0 0 5.99 12.84 10.94 1.59 4.95 3.79 3.76 2.60 2.89 3.99 3.96 3.95 0 0 0 0 0 46.13 19.85 2.32 3.92 0 0 0 0 0 20.62 9.15 2.25 3.84 0 0 0 0 0 21.94 9.78 2.24 3.79 0 0 0 0 0 5.29 1.41 3.74 3.78 6 0 6 0 0 32.69 14.78 2.21 3.73 0 0 0 0 0 55.45 25.30 2.19 3.73 0 0 0 0 0 4.91 1.29 3.82 3.73 0 0 0 0 0 4.37 1.02 4.28 3.72 0 0 0 0 0 4.36 1.03 4.22 3.68 2 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 333.96 8.11 3.16 12.34 18.91 152.95 2.73 0.65 4.92 8.34 2.18 2.97 4.86 2.51 2.27 3.68 3.67 3.67 3.67 3.66 2 0 2 0 0 110.40 50.84 2.17 3.65 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9.19 9.25 9.55 98.80 3.20 3.26 3.44 46.64 2.87 2.84 2.77 2.12 3.63 3.61 3.56 3.54 0 0 0 0 0 5.25 1.57 3.35 3.53 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4.01 2.79 6.74 0.94 0.60 2.34 4.25 4.63 2.88 3.52 3.49 3.49 0 0 0 0 0 82.06 39.22 2.09 3.48 0 0 0 1 0 0 0 0 1 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16.67 4.58 33.68 18.56 3.30 107.38 4.10 5.61 20.40 4.75 44.66 3.18 34.64 128.27 7.60 3.45 15.04 3.36 7.32 1.24 16.07 8.58 0.75 52.02 1.03 1.75 9.91 1.37 21.83 0.74 16.94 63.50 2.86 0.84 6.93 0.81 2.28 3.70 2.10 2.16 4.41 2.06 3.99 3.21 2.06 3.47 2.05 4.29 2.04 2.02 2.65 4.09 2.17 4.14 3.47 3.47 3.45 3.45 3.44 3.42 3.42 3.40 3.38 3.38 3.33 3.32 3.32 3.32 3.29 3.29 3.29 3.29 Heat shock protein, alpha-crystallin-related NM_001012401 0 0 0 0 0 7.11 2.68 2.65 3.23 Ras-related protein Rab-7b Transmembrane protein 10 Tripartite motif protein 25 Histocompatibility 2, T region locus 23 Ms4a6c protein AXL receptor tyrosine kinase MHC class I Q4 beta-2-microglobulin (Qb1) Tubulin beta-5 chain (Beta-tubulin class-V) homolog AK030688 NM 153520 AK169562 NM 010398 BC062247 NM 009465 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 4.86 14.00 6.39 12.19 2.10 20.86 1.51 6.33 2.35 5.55 0.52 10.40 3.22 2.21 2.72 2.20 4.08 2.01 3.23 3.23 3.22 3.21 3.20 3.20 AK037574 0 0 0 0 0 7.52 2.85 2.64 3.19 BC008225 0 0 0 0 0 2.19 0.54 4.03 3.16 Tnfrsf1a Tumor necrosis factor receptor superfamily NM_011609 0 0 0 0 0 10.35 4.37 2.37 3.15 Lcp1 NM 008879 0 0 0 0 0 10.43 4.40 2.37 3.14 AK088605 0 0 0 0 0 8.84 3.48 2.54 3.14 NM 201226 2 0 0 1 1 35.00 17.95 1.95 3.13 NM_026819 0 0 0 0 0 42.45 21.35 1.99 3.13 Cd22 Lymphocyte cytosolic protein 1 Glycoprotein galactosyltransferase alpha 1, 3 Leucine rich repeat containing 47 Dehydrogenase/reductase (SDR family) member 1 CD22 antigen Arhgdib Rho, GDP dissociation inhibitor (GDI) beta C1qg 5430435G22Rik Tmem10 Trim25 H2-T23 Ms4a6c Axl H2-Q1 Tubb6 Ggta1 Lrrc47 Dhrs1 Slc14a1 Trim21 Nfe2l2 Fabp7 BC089618 Sfrp5 D12Ertd647e Capg Pik3ap1 Timp1 Urea transporter, erythrocyte (Urea transporter B) (UT-B) Tripartite motif protein 21 Nuclear factor erythroid 2 related factor 2 Fatty acid binding protein 7, brain Novel protein similar to extracellular proteinase inhibitor Expi Secreted frizzled-related sequence protein 5 Hypothetical protein LOC52668 isoform 2 Gelsolin-like capping protein Phosphoinositide-3-kinase adaptor protein 1 Metalloproteinase inhibitor 1 precursor Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience AK171589 0 0 0 0 0 4.48 1.36 3.29 3.12 NM_007486 0 0 0 0 0 19.72 9.76 2.02 3.09 AK153891 0 0 0 0 0 12.80 6.01 2.13 3.09 AK141294 U20532 NM 021272 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3.70 17.09 10.95 1.03 8.45 4.83 3.59 2.02 2.27 3.08 3.08 3.08 BC089618 0 0 0 0 0 2.71 0.69 3.93 3.06 NM_018780 0 0 0 0 0 3.18 0.83 3.81 3.05 NM 194066 NM 007599 0 0 0 0 0 0 0 0 0 0 16.90 3.46 8.51 0.93 1.99 3.70 3.04 3.03 NM_031376 0 0 0 0 0 5.62 1.99 2.83 3.03 BC008107 0 0 0 0 0 1.86 0.50 3.70 3.01 14 Polymenidou et al Cd63 1810029B16Rik Ctsc Gusb Stat2 SUPPLEMENTARY INFORMATION NM 007653 NM 025465 NM 009982 BC004616 AF206162 NM_030150 0 0 0 0 0 2.49 0.69 3.61 2.76 Cd74 Cd63 antigen Hypothetical protein LOC66282 Cathepsin C preproprotein Beta-glucuronidase structural Stat2 protein Poly (ADP-ribose) polymerase family, member 12 Transcobalamin 2 Fc receptor, IgG, high affinity I Colony stimulating factor 2 receptor, beta 2, low-affinity (granulocyte-macrophage) Complement component 5, receptor 1 Cd151 protein CKLF-like MARVEL transmembrane domain-containing protein 3 (Chemokinelike factor superfamily member 3) Epithelial stromal interaction 1 isoform b Peripheral benzodiazepine receptor Hypothetical protein LOC74096 Ribosomal protein L39 Scotin isoform 1 B-cell leukemia/lymphoma 3 Calponin 3, acidic Interferon gamma inducible protein 47 Adipose differentiation related protein Tumor necrosis factor receptor superfamily, Lipoma HMGIC fusion partner-like 2 Interferon-induced protein 35 Epithelial membrane protein 3 Transcription factor 7-like 2 Hypothetical Peptidase family M20/M25/M40 containing protein Transglutaminase 2, C polypeptide Legumain Uncoupling protein 2 Elastase 1, pancreatic DNA segment, Chr 11, Lothar Hennighausen 2, Ia-associated invariant chain NM 010545 0 0 0 0 0 7.68 3.34 2.30 2.73 Tgfbr2 Transforming growth factor, beta receptor II NM_009371 0 0 0 0 0 10.18 4.75 2.14 2.73 Rac2 Apoe Olig2 Anxa3 Apod Slc11a1 Cxcl16 Rab31 Fcgr3 BC016112 P2ry13 Ctsh 1810009M01Rik RAS-related C3 botulinum substrate 2 Apolipoprotein E Oligodendrocyte transcription factor 2 Annexin A3 Apolipoprotein D Solute carrier family 11 Chemokine (C-X-C motif) ligand 16 Rab31 protein Fcgr3 protein Col20a1 protein Purinergic receptor P2Y13 Cathepsin H LR8 protein NM 009008 NM 009696 NM 016967 AK169423 NM 007470 NM 013612 NM 023158 BC013063 BC052819 BC016112 NM 028808 NM 007801 NM_023056 0 0 0 0 0 0 0 9 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4.47 556.59 13.58 9.95 84.02 6.06 4.46 36.69 26.35 2.62 11.97 13.22 23.57 1.58 314.21 6.98 4.68 47.53 2.55 1.58 20.69 15.06 0.76 6.17 6.76 13.38 2.83 1.77 1.94 2.13 1.77 2.38 2.83 1.77 1.75 3.43 1.94 1.96 1.76 2.72 2.71 2.71 2.71 2.70 2.70 2.70 2.70 2.69 2.68 2.68 2.67 2.66 Mcam Cell surface glycoprotein MUC18 precursor AB035509 Parp12 Tcn2 Fcgr1 Csf2rb2 C5r1 Cd151 Cmtm3 Epsti1 Bzrp 0610039P13Rik Rpl39 Scotin Bcl3 Cnn3 Ifi47 Adfp Tnfrsf1b Lhfpl2 Ifi35 Emp3 Tcf7l2 AK128969 Tgm2 Lgmn Ucp2 Ela1 D11Lgp2e Rps27l Abhd4 Zfp36l1 Ctsl Bcl2a1b Ctsb 0610011I04Rik Dbi Prg1 Icam1 Cd109 Ccl3 Waspip Npc2 Aqp4 Rps14 Parp3 Trim14 Tspan2 Cyr61 Aebp1 Tnfaip2 Dnase1l1 Tnc Anxa5 Ephx1 AK153119 Tlr7 Mmp2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26.25 6.39 10.99 11.23 14.11 14.29 2.51 4.96 5.15 6.79 1.84 2.55 2.21 2.18 2.08 2.99 2.99 2.98 2.98 2.97 NM_172893 1 0 1 0 0 7.20 2.87 2.51 2.97 NM 015749 NM 010186 0 0 0 0 0 0 0 0 0 0 19.96 6.63 10.54 2.64 1.89 2.52 2.96 2.96 AK155626 0 0 0 0 0 2.04 0.55 3.70 2.95 NM 007577 BC012236 0 0 0 0 0 0 0 0 0 0 3.23 13.80 0.88 6.75 3.67 2.05 2.95 2.93 BC003230 0 0 0 0 0 7.46 3.07 2.43 2.92 0 0 0 0 1 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1.90 4.33 9.16 11.50 22.93 2.60 26.55 1.77 5.59 0.52 1.42 3.95 5.32 12.42 0.71 14.74 0.52 2.19 3.62 3.06 2.32 2.16 1.85 3.68 1.80 3.40 2.55 2.91 2.88 2.88 2.87 2.87 2.85 2.85 2.83 2.83 NM NM NM NM NM NM NM NM NM 178825 009775 028752 026055 025858 033601 028044 008330 007408 NM_011610 0 0 0 0 0 7.31 3.11 2.35 2.83 NM 172589 NM 027320 NM 010129 AJ223070 1 0 0 9 0 0 0 0 1 0 0 9 0 0 0 0 0 0 0 0 11.88 3.02 2.29 5.23 5.89 0.84 0.63 1.95 2.02 3.57 3.66 2.69 2.82 2.82 2.82 2.82 AK128969 NM NM NM NM 009373 011175 011671 033612 Ribosomal protein S27-like NM 026467 Abhydrolase domain containing 4 NM 134076 Zinc finger protein 36, C3H type-like 1 NM 007564 Cathepsin L precursor AK159511 B-cell leukemia/lymphoma 2 related NM_007534 protein A1b NM 007798 Cathepsin B preproprotein Hepatocellular carcinoma-associated NM_025326 antigen 112 NM 001037999 Diazepam binding inhibitor isoform 1 Proteoglycan 1, secretory granule NM 011157 Intercellular adhesion molecule NM 010493 Hypothetical Alpha-macroglobulin receptor domain/Terpenoid cylases/Protein BC052443 prenyltransferases structure containing protein Chemokine (C-C motif) ligand 3 NM 011337 Wiskott-Aldrich syndrome protein NM_153138 interacting Niemann Pick type C2 NM 023409 Aquaporin-4 (AQP-4) (WCH4) (MercurialBC024526 insensitive water channel) (MIWC) Ribosomal protein S14 NM 020600 Poly (ADP-ribose) polymerase family, NM_145619 member 3 AK148837 Tripartite motif-containing 14 Hypothetical CD9/CD37/CD63 antigens AK020982 containing protein Cysteine rich protein 61 NM 010516 AEBP1 protein BC082577 Tumor necrosis factor, alpha-induced AK154405 protein 2 BC023246 Deoxyribonuclease I-like 1 precursor Tenascin C (Hexabrachion) AK085667 Annexin A5 NM 009673 Ephx1 protein BC057857 Hypothetical protein AK153119 Toll-like receptor 7 precursor AK154906 Matrix metalloproteinase 2 NM 008610 Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 0 0 0 0 0 0 0 0 0 0 4.50 1.50 3.00 2.81 0 4 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 7.86 49.09 11.71 2.64 3.34 27.55 5.81 0.74 2.35 1.78 2.01 3.57 2.81 2.81 2.79 2.79 0 0 0 0 0 9.44 4.35 2.17 2.66 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7.61 38.83 17.26 62.27 3.38 21.78 9.32 35.74 2.25 1.78 1.85 1.74 2.65 2.65 2.64 2.64 0 0 0 0 0 2.30 0.67 3.44 2.63 1 0 0 0 1 174.71 100.36 1.74 2.63 0 0 0 0 0 10.51 5.16 2.04 2.63 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 55.28 8.49 4.49 31.49 3.85 1.70 1.76 2.20 2.65 2.62 2.61 2.60 0 0 0 0 0 2.38 0.70 3.39 2.59 0 0 0 0 0 3.94 1.33 2.96 2.59 2 0 2 0 0 12.07 6.37 1.90 2.59 0 0 0 0 0 43.30 24.85 1.74 2.59 1 0 1 0 0 35.71 20.55 1.74 2.59 1 0 1 0 0 21.63 12.39 1.75 2.57 0 0 0 0 0 5.57 2.37 2.35 2.57 0 0 0 0 0 2.07 0.64 3.23 2.57 1 0 0 0 1 44.17 25.55 1.73 2.57 0 0 0 0 0 0 0 0 0 0 5.91 8.49 2.59 3.94 2.28 2.16 2.57 2.56 0 0 0 0 0 2.85 0.87 3.27 2.56 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.97 2.68 21.03 18.49 2.37 6.81 2.85 0.62 0.82 12.04 10.10 0.71 3.09 0.88 3.19 3.28 1.75 1.83 3.33 2.20 3.24 2.56 2.55 2.55 2.54 2.54 2.53 2.53 15 Polymenidou et al Psmb9 Pqlc3 Mgst1 Ms4a6d Zc3hav1 Zfp488 Tmem86a Maff Hexb 1110007F12Rik AK051656 Rps26 Plek Prkcd AK048742 Pllp Ppgb Mpzl1 Aldh1a1 Ninj1 Nek6 BC013712 Olfml3 Atf3 Anxa4 Tspan4 Rpl32 Dbp Clic1 Col5a3 Vat1 Lyn Proteosome (prosome, macropain) subunit, beta Similar to mannose-P-dolichol utilization defect 1 protein homolog Microsomal glutathione S-transferase 1 MS4A6D protein Hypothetical protein Zinc finger protein 488 Hypothetical protein LOC67893 V-maf musculoaponeurotic fibrosarcoma oncogene Hexosaminidase B Hypothetical protein LOC68487 Protein KIAA1949 homolog 40S ribosomal protein S26 Pleckstrin Protein kinase C, delta Similar to Chromodomain-helicase-DNAbinding protein SUPPLEMENTARY INFORMATION NM_013585 0 0 0 0 0 2.14 0.66 3.22 2.53 AK138680 0 0 0 0 0 2.42 0.73 3.30 2.53 NM 019946 NM 026835 AK143568 NM 001013777 NM 026436 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10.11 4.62 3.61 4.02 10.94 4.95 1.80 1.22 1.46 5.60 2.04 2.57 2.95 2.76 1.95 2.52 2.52 2.51 2.51 2.50 NM_010755 0 0 0 0 0 1.67 0.56 2.96 2.50 NM 010422 NM 197986 AK051656 BC100456 NM 019549 NM 011103 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 75.76 1.49 7.93 27.29 29.57 9.26 44.83 0.53 3.68 16.03 17.32 4.51 1.69 2.81 2.16 1.70 1.71 2.05 2.50 2.48 2.48 2.48 2.48 2.47 AK048742 3 0 3 0 0 4.75 1.90 2.49 2.47 NM_026385 0 0 0 0 0 32.01 18.77 1.71 2.47 Protective protein for beta-galactosidase NM 001038492 Myelin protein zero-like 1 AK090278 Aldehyde dehydrogenase family 1, NM_013467 subfamily A1 Ninjurin 1 NM 013610 NIMA (never in mitosis gene a)-related NM_021606 expressed NM 001033308 Hypothetical protein LOC230787 Olfactomedin-like 3 NM 133859 Activating transcription factor ATF3b BC019946 Annexin A4 AK032703 Tetraspanin 4 NM 053082 Ribosomal protein L32 NM 172086 D site albumin promoter binding protein NM 016974 Chloride intracellular channel 1 NM 033444 Adipocyte-specific protein 6 AB040491 Vesicle amine transport protein 1 homolog NM 012037 0 5 0 0 0 5 0 0 0 0 43.83 13.64 26.11 7.37 1.68 1.85 2.46 2.46 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 Tyrosine-protein kinase Lyn (EC 2.7.1.112) 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 5 0 0 0 0 0 0 0 0 0 0 0 1 0 4 0 0 0 0 0 0 1 0 0 0 0 0 0 0 Transmembrane 4 superfamily member 11 BC031547 NM 011309 S100 calcium binding protein A1 Plasminogen activator, urokinase NM 008873 Peroxiredoxin 1 NM 011034 Colony stimulating factor 1 NM 007778 Toll-like receptor 3 NM 126166 G protein-coupled receptor 17 NM 001025381 Integrin beta-5 precursor BC058246 Protease, serine, 18 NM 011177 Transferrin NM 133977 Lipopolysaccharide inducible C-C Ccrl2 NM_017466 chemokine BC082538 P4ha3 P4ha3 protein Apobec3 Apolipoprotein B editing complex 3 AK153719 Gns Glucosamine (N-acetyl)-6-sulfatase NM 029364 Tcirg1 T-cell, immune regulator 1 NM 016921 Parvg Parvin, gamma NM 022321 AK042457 Flavin containing monooxygenase 1 AK042457 Serinc5 Serine incorporator 5 NM 172588 Gap junction epsilon-1 protein (ConnexinGje1 AK147913 29) Lxn Latexin NM 016753 A930021H16Rik Weakly similar to MICAL-like 2 AK151086 Serine caroboxypeptidase 1 AK153485 Scpep1 Sox9 SRY-box containing gene 9 NM 011448 Nupr1 P8 protein NM 019738 Pros1 Protein S (alpha) AK145501 Sfpi1 Transcription factor PU.1 L03215 CASP8 and FADD-like apoptosis regulator NM_207653 Cflar isoform Interleukin-10 receptor beta chain AK160991 Il10rb precursor AF076623 Nes Nes protein (Fragment) S100a13 S100 calcium binding protein A13 NM 009113 Proteasome (prosome, macropain) 28 Psme1 NM_011189 subunit NM 146023 Evi2b Ecotropic viral integration site 2b Naglu Alpha-N-acetylglucosaminidase NM 013792 Glipr2 GLI pathogenesis-related 2 NM 027450 Tyrosine-protein phosphatase non-receptor AK132509 Ptpn6 type 6 BC060276 BC060276 Filamin C (Gamma-filamin) (Filamin 2) BC046309 Novel protein BC046309 Sparc Secreted acidic cysteine rich glycoprotein AK148670 Apob48r Apolipoprotein B48 receptor NM 138310 Cd300d MAIR-Iib AB091768 Tgif TG-interacting factor NM 009372 Id3 Inhibitor of DNA binding 3 NM 008321 Litaf LPS-induced TN factor NM 019980 Hypothetical SAND/Sp100 containing AK150373 5830484A20Rik protein S100a1 Plau Prdx1 Csf1 Tlr3 Gpr17 Itgb5 Klk6 Trf 0 0 0 0 0 18.51 10.33 1.79 2.45 0 0 0 0 0 8.57 4.14 2.07 2.44 0 0 10.89 5.66 1.92 2.43 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.82 15.03 2.40 4.72 5.11 23.83 23.96 8.39 1.45 26.33 0.90 8.31 0.76 1.92 2.16 14.34 14.46 4.02 0.53 15.78 3.12 1.81 3.18 2.46 2.37 1.66 1.66 2.09 2.74 1.67 2.43 2.43 2.43 2.42 2.42 2.42 2.42 2.41 2.41 2.41 6.83 3.17 2.15 2.39 40.59 4.84 40.08 12.14 11.10 24.29 21.94 8.26 135.42 24.20 2.04 23.94 6.65 5.84 14.61 13.15 4.00 82.88 1.68 2.37 1.67 1.82 1.90 1.66 1.67 2.07 1.63 2.39 2.38 2.38 2.38 2.36 2.36 2.35 2.35 2.34 1.39 0.52 2.67 2.34 4.46 5.37 45.72 8.29 5.93 6.68 37.22 1.83 2.44 28.30 4.07 2.79 3.22 22.46 2.44 2.20 1.62 2.04 2.13 2.07 1.66 2.33 2.32 2.32 2.31 2.31 2.31 2.31 2 0 2 0 0 96.07 59.23 1.62 2.31 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 20.88 1.97 15.18 20.07 8.30 11.06 5.42 12.57 0.67 8.64 12.18 4.12 5.90 2.49 1.66 2.93 1.76 1.65 2.01 1.87 2.18 2.31 2.30 2.30 2.30 2.30 2.30 2.29 1 0 0 0 1 13.76 7.83 1.76 2.29 0 0 0 0 0 7.84 3.84 2.04 2.28 0 0 0 0 0 0 0 0 0 0 7.21 12.91 3.52 7.24 2.05 1.78 2.28 2.28 0 0 0 0 0 9.19 4.67 1.97 2.27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14.44 8.57 3.49 8.24 4.29 1.30 1.75 2.00 2.68 2.27 2.27 2.27 0 0 0 0 0 5.80 2.76 2.10 2.26 0 0 3 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2.56 4.31 119.54 1.61 9.78 2.94 16.04 18.08 0.87 1.78 74.49 0.59 5.12 1.03 9.47 10.54 2.94 2.42 1.60 2.70 1.91 2.86 1.69 1.72 2.26 2.26 2.26 2.25 2.25 2.25 2.25 2.24 0 0 0 0 0 1.42 0.55 2.58 2.24 Hcls1 Hematopoietic cell specific Lyn substrate 1 NM_008225 0 0 0 0 0 5.13 2.27 2.26 2.23 Pabpc1 Poly A binding protein, cytoplasmic 1 Ubiquitin-specific protease 2 isoform Usp245 NM 008774 0 0 0 0 0 55.02 34.31 1.60 2.22 NM_198091 1 0 1 0 0 22.60 13.99 1.62 2.22 Usp2 Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 16 Polymenidou et al SUPPLEMENTARY INFORMATION Protein tyrosine phosphatase, receptor type, C Sdc4 protein Clusterin NM_011210 2 0 1 0 1 4.38 1.86 2.36 2.21 Sdc4 Clu BC002312 NM 013492 3 1 0 0 0 0 0 0 3 1 46.40 259.47 29.47 163.59 1.57 1.59 2.21 2.20 H2-Eb1 Histocompatibility 2, class II antigen E beta NM_010382 0 0 0 0 0 2.45 0.85 2.87 2.20 Ikbke Inhibitor of kappaB kinase epsilon NM 019777 0 0 0 0 0 1.67 0.62 2.70 2.20 Slc7a2 Solute carrier family 7 (cationic amino acid NM_007514 4 0 0 0 4 18.08 10.65 1.70 2.20 0 0 0 0 0 29.58 18.32 1.61 2.20 0 0 0 0 0 3.69 1.40 2.64 2.20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5.58 2.92 10.56 2.69 1.05 5.80 2.08 2.80 1.82 2.19 2.19 2.19 0 0 0 0 0 1.38 0.54 2.54 2.19 0 0 0 0 0 5.33 2.52 2.12 2.19 Ptprc Lamp2 H2-Aa Tor3a Arl11 Hspb8 Stat5a Rnf122 Tgfbi Fgl2 Erbb2ip Slc6a6 Ptgds2 Rgma Ehd4 Lats2 Add3 Csf2rb1 9230117N10Rik AK046834 Ctdsp1 BC023356 Gm2a Rhoc Chi3l1 Vcam1 Rpl14 Anpep Ly6e Pbxip1 Cldn11 Cpne3 Sema6a Hpse Il1a Ltbr S100a4 Syngr2 Cd83 Pycard Dap Lysosomal membrane glycoprotein 2 NM_001017959 isoform 1 Histocompatibility 2, class II antigen A, NM_010378 alpha BC071216 Torsin family 3, member A ADP-ribosylation factor-like 11 NM 177337 Heat shock protein 20-like protein NM 030704 Signal transducer and activator of NM_011488 transcription Hypothetical RING finger containing AK171823 protein Transforming growth factor-beta-induced AK155828 protein ig-h3 precursor (Beta ig-h3) Fibrinogen-like protein 2 NM 008013 Erbb2 interacting protein isoform 2 NM 021563 Solute carrier family 6 (neurotransmitter AK162776 transporter, taurine), member 6 Prostaglandin D2 synthase 2, NM_019455 hematopoietic RGM domain family, member A NM 177740 EH domain containing protein MPAST2 NM 133838 LATS2B AY015061 Adducin 3 (gamma) AK144174 Colony stimulating factor 2 receptor, beta AK170199 1, low-affinity (granulocyte-macrophage) NM 133775 Interleukin 33 Hypothetical ATP-dependent DNA ligase AK046834 containing protein CTD (carboxy-terminal domain, RNA NM_153088 polymerase II Transcription factor SOX-10 (SOX-21) BC023356 (Transcription factor SOX-M) NM 010299 GM2 ganglioside activator protein Ras homolog gene family, member C NM 007484 Chitinase 3-like 1 NM 007695 Vascular cell adhesion molecule 1 AK030195 Ribosomal protein L14 NM 025974 Alanyl (membrane) aminopeptidase NM 008486 Lymphocyte antigen Ly-6E precursor BC005684 Pre-B-cell leukemia transcription factor NM 146131 Claudin 11 NM 008770 Copine III NM 027769 Sema domain, transmembrane domain (TM), and cytoplasmic domain, AK042751 (semaphorin) 6A Heparanase NM 152803 Interleukin 1 alpha NM 010554 Lymphotoxin B receptor NM 010736 S100 calcium binding protein A4 NM 011311 Synaptogyrin 2 NM 009304 CD83 antigen BC107344 PYD and CARD domain containing NM 023258 Death-associated protein NM 146057 0 0 0 0 0 3.85 1.52 2.54 2.19 0 15 0 0 0 14 0 1 0 0 2.20 52.87 0.76 33.12 2.88 1.60 2.18 2.18 10 0 7 0 3 65.13 41.37 1.57 2.17 0 0 0 0 0 8.10 4.14 1.95 2.17 0 0 3 14 0 0 0 0 0 0 2 14 0 0 0 0 0 0 1 0 21.16 8.51 8.99 38.18 13.15 4.41 4.72 23.73 1.61 1.93 1.90 1.61 2.17 2.16 2.16 2.15 0 0 0 0 0 2.66 0.94 2.84 2.15 0 0 0 0 0 35.00 22.20 1.58 2.14 1 0 1 0 0 6.80 3.39 2.01 2.14 0 0 0 0 0 18.40 11.18 1.65 2.13 0 0 0 0 0 19.60 12.35 1.59 2.10 0 0 0 0 0 1 0 0 1 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 44.63 8.55 7.43 12.23 13.16 3.27 66.62 19.28 121.77 12.09 28.75 4.50 3.80 7.14 7.69 1.25 43.05 12.04 78.83 7.11 1.55 1.90 1.95 1.71 1.71 2.63 1.55 1.60 1.54 1.70 2.10 2.10 2.10 2.10 2.09 2.09 2.09 2.09 2.08 2.07 4 0 4 0 0 10.85 6.10 1.78 2.07 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.46 2.04 5.40 3.08 6.68 8.06 3.35 5.44 0.91 0.76 2.68 1.19 3.44 4.24 1.35 2.73 2.72 2.69 2.01 2.60 1.94 1.90 2.48 1.99 2.07 2.06 2.05 2.05 2.05 2.05 2.04 2.04 Trp53inp1 Transformation related protein 53 inducible NM_021897 0 0 0 0 0 9.11 4.94 1.84 2.04 Cotl1 Stom Pdlim4 Npl Sgpl1 Coactosin-like protein Stom protein Reversion induced LIM gene N-acetylneuraminate pyruvate lyase Sphingosine phosphate lyase 1 Voltage-dependent calcium channel gamma-5 Hypothetical Actin-crosslinking proteins containing protein T-cell acute lymphocytic leukemia 1 Adaptin-ear-binding coat-associated protein 2 Bcl2-associated athanogene 3 Mannosidase alpha class 2B member 2 Vesicle-associated membrane protein 8 Interferon consensus sequence binding protein 1 Signal transducer and activator of transcription Stress-associated endoplasmic reticulum protein Cyclin G1 Loss of heterozygosity, 11, chromosomal region BC010249 BC003789 NM 019417 NM 028749 NM 009163 3 0 0 0 1 0 0 0 0 0 3 0 0 0 1 0 0 0 0 0 0 0 0 0 0 42.36 9.75 5.47 5.27 22.27 27.47 5.41 2.73 2.60 14.29 1.54 1.80 2.00 2.03 1.56 2.04 2.04 2.04 2.04 2.04 NM_080644 3 0 3 0 0 7.26 3.80 1.91 2.03 AK090002 1 0 1 0 0 4.32 1.94 2.22 2.03 NM 011527 0 0 0 0 0 1.36 0.57 2.39 2.02 NM_025383 1 0 1 0 0 18.09 11.08 1.63 2.02 NM 013863 NM 008550 NM 016794 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 6.88 16.04 9.46 3.62 9.90 5.26 1.90 1.62 1.80 2.02 2.01 2.01 BC005450 1 0 1 0 0 5.86 3.00 1.95 2.01 Cacng5 4930429M06Rik Tal1 Necap2 Bag3 Man2b2 Vamp8 Irf8 Stat3 D3Ucla1 Ccng1 Loh11cr2a Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience NM_011486 3 0 3 0 0 20.13 12.93 1.56 2.01 NM_030685 1 0 0 0 1 29.42 18.99 1.55 2.01 NM 009831 0 0 0 0 0 35.70 23.34 1.53 2.00 NM_172767 1 0 0 0 1 11.14 6.60 1.69 2.00 17 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Table 2 Genes downregulated upon TDP-43 depletion in adult mouse brain CLIP-seq number of clusters Gene symbol Protein Kl Park2 Fgf14 Sdk1 Efna5 Kcnip4 T-cell lymphoma breakpoint associated target 1 TAR DNA binding protein isoform 5 Calcium-dependent secretion activator 1 CUB and Sushi multiple domains 1 Ctnna2 protein Hypothetical carbonic anhydrase structure containing protein Calsyntenin 2 Neuregulin 3 Hypothetical protein LOC72899 Adenylate cyclase 2 Potassium voltage-gated channel, Shalrelated Low density lipoprotein-related protein 1B (deleted in tumors) Calcium channel, voltage-dependent, alpha2/delta Neurexin III Hyaluronan and proteoglycan link protein 4 precursor Klotho Parkin Fibroblast growth factor 14 isoform b Sidekick homolog 1 Ephrin A5 isoform 2 Kv channel interacting protein 4 Cntnap2 Tmem28 Sntg1 Hs6st3 AK134882 Kcnj3 Hdh Arc Rims1 Klf10 Nlgn1 Kcnk4 Folr1 Slc17a8 Tcba1 Tardbp Cadps Csmd1 Ctnna2 Car10 Clstn2 Nrg3 2900006F19Rik Adcy2 Kcnd2 Lrp1b Cacna2d3 Nrxn3 in in 5'UTR introns RNA-seq Refseq/mRNA identifier in RPKM in in 3'UTR exons TDP-43 KD RPKM in Ratio Z score control KD/control Total NM_001025286 58 0 57 1 0 3.16 18.06 0.18 -8.17 NM_001003898 BC079679 NM_053171 BC079648 5 56 154 115 0 0 0 0 0 56 152 115 0 0 1 0 5 0 1 0 10.17 7.62 2.51 8.80 43.61 23.80 9.43 22.44 0.23 0.32 0.27 0.39 -6.91 -5.90 -5.39 -4.88 AK012302 52 1 51 0 0 6.05 15.81 0.38 -4.67 NM_022319 NM_008734 NM_028387 NM_153534 38 132 36 15 0 0 0 0 38 131 36 15 0 0 0 0 0 1 0 0 7.70 2.75 2.94 8.57 19.32 8.15 8.52 20.60 0.40 0.34 0.34 0.42 -4.59 -4.47 -4.35 -4.25 NM_019697 78 0 77 0 1 14.60 34.39 0.42 -4.18 AK136118 87 0 87 0 0 2.59 6.87 0.38 -4.08 NM_009785 63 0 63 0 0 12.00 25.88 0.46 -4.05 BC060719 178 0 178 0 0 16.25 37.23 0.44 -3.93 AK034300 0 0 0 0 0 9.71 20.10 0.48 -3.86 NM_013823 AF250294 NM_207667 NM_177879 NM_010109 AK148685 0 39 85 29 21 109 0 0 0 0 0 0 0 39 85 29 21 109 0 0 0 0 0 0 0 0 0 0 0 0 2.53 0.63 6.99 0.51 3.90 3.98 6.33 2.18 14.99 1.54 9.33 9.30 0.40 0.29 0.47 0.33 0.42 0.43 -3.84 -3.78 -3.76 -3.69 -3.69 -3.58 Contactin associated protein-like 2 isoform b NM_025771 116 0 116 0 0 11.00 21.03 0.52 -3.52 Transmembrane protein 28 Syntrophin, gamma 1 Heparan sulfate 6-O-sulfotransferase 3 Hypothetical protein Potassium inwardly-rectifying channel J3 Huntington disease gene homolog Activity regulated cytoskeletal-associated Rims1 protein Kruppel-like factor 10 Neuroligin 1 TRAAK Folate receptor 1 Vesicular glutamate transporter-3 Similar to potassium channel protein erg long isoform weakly similar to Keratin 8 Choline acetyltransferase Regulator of G-protein signaling 6 Slit homolog 3 Glutamate receptor, metabotropic 7 Protein tyrosine phosphatase, receptor type, N polypeptide 2 NM_173446 NM_027671 NM_015820 AK134882 NM_008426 NM_010414 NM_018790 BC055294 NM_013692 NM_138666 NM_008431 NM_008034 NM_182959 79 23 80 2 21 10 0 6 0 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 79 23 79 2 21 9 0 6 0 101 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 6.21 2.01 4.14 2.47 6.57 8.52 25.40 15.32 6.55 8.55 0.76 0.67 1.08 12.27 4.88 8.95 5.55 12.70 16.55 50.51 29.50 12.13 16.28 2.07 1.81 2.84 0.51 0.41 0.46 0.44 0.52 0.51 0.50 0.52 0.54 0.53 0.37 0.37 0.38 -3.49 -3.43 -3.35 -3.33 -3.32 -3.29 -3.29 -3.25 -3.19 -3.19 -3.16 -3.16 -3.16 AK034003 23 0 23 0 0 2.90 5.92 0.49 -3.12 AK171114 NM_009891 NM_015812 NM_011412 NM_177328 0 2 37 40 78 0 0 0 0 0 0 2 37 40 77 0 0 0 0 1 0 0 0 0 0 0.56 0.97 1.68 4.10 8.82 1.39 2.59 4.03 8.39 16.33 0.40 0.37 0.42 0.49 0.54 -3.11 -3.11 -3.10 -3.10 -3.09 AK138655 64 0 63 0 1 18.06 34.41 0.52 -3.08 Lsamp Limbic system-associated membrane protein AK140043 71 0 70 0 1 15.54 28.93 0.54 -3.07 Opcml Opioid binding protein/cell adhesion Hypothetical Glycosyl transferase, family 2 containing protein Islet cell autoantigen 1 Procollagen, type VIII, alpha 2 KCNQ5 potassium channel Mitogen-activated protein kinase 11 Similar to BA93M14.1 (Glypican 5) Zeta-sarcoglycan (Zeta-SG) (ZSG1) Astrotactin 2 isoform a Hypothetical protein LOC170711 Potassium inwardly-rectifying channel, subfamily J, member 6 Myosin, light polypeptide 4 Cystine knot-containing secreted protein Natriuretic peptide receptor 3 isoform b Carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase homolog Kin of IRRE-like 2 Vesicle-associated membrane protein 1 Thyrotropin-releasing hormone degrading Glial cell line derived neurotrophic factor Hypothetical protein LOC243274 Adrenergic receptor, alpha 1d Weakly similar to uridine phosphorylase TAFA1 protein Neurexin-1-alpha precursor (Neurexin Ialpha) Hypothetical Proline-rich region profile/Serine-rich region profile containing protein NM_177906 116 0 115 0 1 18.55 35.28 0.53 -3.06 BC111102 23 0 23 0 0 0.63 1.54 0.41 -3.00 NM_010492 NM_199473 AK147264 NM_011161 AK084223 AK136779 NM_019514 NM_130880 16 0 36 0 24 56 72 30 0 0 0 0 0 1 0 0 16 0 36 0 24 55 72 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4.83 0.61 14.96 5.36 3.87 0.74 5.08 4.47 9.23 1.46 26.18 9.86 7.69 1.85 9.45 8.80 0.52 0.42 0.57 0.54 0.50 0.40 0.54 0.51 -2.99 -2.98 -2.97 -2.97 -2.96 -2.95 -2.95 -2.95 Hapln4 Kcnh7 AK171114 Chat Rgs6 Slit3 Grm7 Ptprn2 4930431L04Rik Ica1 Col8a2 Kcnq5 Mapk11 Gpc5 Sgcz Astn2 Otud7 Kcnj6 Myl4 Sostdc1 Npr3 AK138471 Kirrel2 Vamp1 Trhde Gfra2 C630028F04Rik Adra1d Upp2 C630007B19Rik Nrxn1 AB125594 Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience AK044218 28 0 27 0 1 3.28 6.37 0.52 -2.92 NM_010858 NM_025312 NM_001039181 0 0 2 0 0 0 0 0 2 0 0 0 0 0 0 2.83 0.75 0.64 5.55 1.82 1.52 0.51 0.41 0.43 -2.91 -2.90 -2.89 AK138471 38 0 38 0 0 5.60 9.97 0.56 -2.88 NM_172898 BC049902 NM_146241 NM_008115 NM_172885 NM_013460 BC027189 NM_182808 0 1 17 6 43 0 23 56 0 1 0 0 0 0 0 0 0 0 17 6 43 0 23 56 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0.86 19.15 2.82 9.38 3.15 0.99 0.77 1.93 2.12 34.73 5.43 16.06 5.93 2.30 1.77 4.10 0.41 0.55 0.52 0.58 0.53 0.43 0.43 0.47 -2.85 -2.83 -2.81 -2.80 -2.79 -2.77 -2.76 -2.76 AB093249 162 0 162 0 0 18.66 33.28 0.56 -2.76 AB125594 45 0 43 0 2 9.06 15.71 0.58 -2.76 18 Polymenidou et al Ptprt Grm8 Siglech Ryr2 Nrn1 A830018L16Rik Smyd3 Tssc1 AK161480 Trpc3 Gabra3 Aqp1 Plcb1 Kcnk4 Kcnma1 Protein tyrosine phosphatase, receptor type, T Glutamate receptor, metabotropic 8 SIGLEC-like protein Ryr2 protein Neuritin 1 VEST-1 protein homolog SET and MYND domain containing 3 Tumor suppressing subtransferable candidate 1 Hypothetical protein Transient receptor potential cation channel Gamma-aminobutyric acid (GABA-A) receptor Aquaporin 1 Plcb1 protein Potassium channel subfamily K member 4 Calcium-activated potassium channel alpha subunit 1 SUPPLEMENTARY INFORMATION NM_021464 74 0 74 0 0 8.13 14.29 0.57 -2.75 NM_008174 NM_178706 BC043140 NM_153529 AK135047 NM_027188 21 0 41 2 8 39 0 0 0 0 0 0 21 0 41 2 8 39 0 0 0 0 0 0 0 0 0 0 0 0 1.18 6.14 1.82 17.81 15.04 4.41 2.67 10.50 3.83 31.03 24.75 8.20 0.44 0.59 0.47 0.57 0.61 0.54 -2.75 -2.74 -2.73 -2.73 -2.73 -2.72 NM_201357 6 0 6 0 0 4.51 8.29 0.54 -2.72 AK161480 NM_019510 0 3 0 0 0 3 0 0 0 0 0.55 3.69 1.18 6.83 0.46 0.54 -2.71 -2.71 NM_008067 2 0 2 0 0 15.39 25.51 0.60 -2.71 NM_007472 BC058710 BC061075 0 59 0 0 0 0 0 59 0 0 0 0 0 0 0 0.67 24.54 0.57 1.44 44.03 1.21 0.46 0.56 0.47 -2.68 -2.68 -2.67 U09383 96 0 96 0 0 8.34 14.36 0.58 -2.66 Cacna1c Calcium channel, voltage-dependent, L type, NM_009781 77 0 77 0 0 5.57 9.56 0.58 -2.65 Slc10a4 E030025D05Rik Ttr Solute carrier family 10 E030025D05Rik protein Transthyretin Zinc transporter 3 (ZnT-3) (Solute carrier family 30 member 3) Potassium voltage-gated channel, Hypothetical protein LOC66771 Two cut domains-containing homeodomain protein Kin of IRRE-like protein 3 precursor (Kin of irregular chiasm-like protein 3) Coagulation factor V Similar to neurula-specific ferrodoxin reductase-like protein homolog Adenylate cyclase 1 Hypothetical protein LOC320495 Glutamate receptor, ionotropic, NMDA2A (epsilon) NM_173403 AK173298 NM_013697 0 20 0 0 1 0 0 18 0 0 0 0 0 1 0 1.24 6.06 44.88 2.70 10.10 80.56 0.46 0.60 0.56 -2.65 -2.62 -2.62 Slc30a3 Kcns3 4933439F18Rik Satb2 Kirrel3 F5 2810401C16Rik Adcy1 A130090K04Rik Grin2a Cacna1e Calcium channel, voltage-dependent, R type BC066199 0 0 0 0 0 5.66 9.55 0.59 -2.60 NM_173417 NM_025757 0 1 0 1 0 0 0 0 0 0 0.58 29.33 1.20 51.30 0.48 0.57 -2.60 -2.59 NM_139146 5 0 5 0 0 6.79 10.98 0.62 -2.58 AK045373 74 0 74 0 0 6.02 9.99 0.60 -2.58 NM_007976 0 0 0 0 0 0.94 2.08 0.45 -2.57 AK158809 3 0 2 0 1 10.73 17.93 0.60 -2.57 NM_009622 NM_001033391 17 3 0 0 12 3 0 0 5 0 38.58 2.68 68.45 4.95 0.56 0.54 -2.56 -2.56 NM_008170 83 0 83 0 0 2.19 4.23 0.52 -2.56 NM_009782 40 0 34 6 0 8.43 14.18 0.59 -2.55 81 8 0 0 81 8 0 0 0 0 16.54 7.73 27.53 12.71 0.60 0.61 -2.54 -2.53 59 1 57 0 1 7.07 11.27 0.63 -2.53 3 0 3 0 0 0.61 1.24 0.49 -2.53 39 0 39 0 0 0.57 1.16 0.49 -2.52 2 3 0 0 2 3 0 0 0 0 10.21 5.59 16.56 9.29 0.62 0.60 -2.51 -2.51 28 0 28 0 0 35.79 62.69 0.57 -2.51 0 0 0 0 0 0.52 1.06 0.50 -2.50 28 0 28 0 0 20.31 34.26 0.59 -2.48 26 62 110 34 0 0 0 0 26 62 109 34 0 0 1 0 0 0 0 0 2.37 1.14 18.54 2.34 4.38 2.38 30.84 4.32 0.54 0.48 0.60 0.54 -2.48 -2.47 -2.47 -2.47 52 0 52 0 0 17.00 27.85 0.61 -2.46 20 0 20 0 0 15.37 23.88 0.64 -2.46 3 0 3 0 0 7.38 11.76 0.63 -2.45 0 2 2 4 4 0 0 0 0 0 0 2 2 4 3 0 0 0 0 0 0 0 0 0 1 15.86 10.60 8.94 7.34 18.66 24.95 16.93 14.67 11.59 30.56 0.64 0.63 0.61 0.63 0.61 -2.45 -2.45 -2.42 -2.41 -2.41 63 0 63 0 0 9.67 15.47 0.62 -2.41 0 75 0 0 0 0 0 74 0 0 0 0 0 1 0 2.49 10.39 1.45 4.46 16.43 2.83 0.56 0.63 0.51 -2.40 -2.40 -2.39 Fibroblast growth factor 12 isoform b NM_010199 Weakly similar to copine VII AK014396 Weakly similar to heterogeneous nuclear BC052358 0710005M24Rik ribonucleoproteins C1/C2 (hnRNPC1/C2) NM_007733 Col19a1 Procollagen, type XIX, alpha 1 Hypothetical calcium-binding EGF-like AK018073 AK018073 domain containing protein homolog G protein-coupled receptor 26 NM_173410 Gpr26 Cpne9 Similar to copine-like protein KIAA1599 AK044080 Hypothetical EF-hand/Phorbol esters/diacylglycerol binding AK083222 Dgkb domain/Cytochrome c family heme-binding site containing protein Leucine-rich repeat-containing protein 17 BC030317 BC030317 precursor Calcium/calmodulin-dependent protein NM_009793 Camk4 kinase IV NM_011855 Odz1 Odd Oz/ten-m homolog 1 Wwox WW-domain oxidoreductase NM_019573 Igsf4d Weakly similar to BK134P22.1 AK046800 AK138164 NB-2 homolog AK138164 Glutamate receptor, ionotropic, kainate 2 NM_010349 Grik2 (beta) NM_001033399 Gfod1 Hypothetical protein LOC328232 Glutamate receptor, ionotropic, NMDA2C NM_010350 Grin2c (epsilon) BC014845 Klc2 Klc2 protein Ttc15 Weakly similar to CGI-87 PROTEIN AK031763 L42339 Sodium channel 3 L42339 Rtn4r Nogo receptor NM_022982 Syt13 Synaptotagmin XIII NM_030725 Weakly similar to Hypothetical band 4.1 AK147370 Pdzd10 family containing protein NM_021306 Ecel1 Damage-induced neuronal endopeptidase Dpp10 Dipeptidyl peptidase 10 NM_199021 Fbxw8 F-box protein FBX29 AK015338 Collagen and calcium binding EGF domains NM_178793 Ccbe1 1 NM_053195 Slc24a3 Solute carrier family 24 Snrp70 U1 small nuclear ribonucleoprotein 70 kDa NM_009224 Podxl2 Podocalyxin-like 2 NM_176973 A930001M12Rik Hypothetical protein LOC320500 NM_177175 Gpr158 G protein-coupled receptor 158 isoform b NM_175706 Zfp180 Zfp180 protein BC063741 Cbln2 Cerebellin 2 precursor protein NM_172633 Stx1a Syntaxin 1A NM_016801 Neural cell adhesion molecule 2 precursor AF001287 Ncam2 (N-CAM 2) (RB-8 neural cell adhesion molecule) AK145822 Fosl2 Fos-like antigen 2 Fgf12 Cpne4 Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 4 0 3 0 1 0.61 1.18 0.52 -2.37 17 2 1 0 20 3 0 1 0 0 0 0 0 1 0 0 17 1 1 0 20 2 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 13.17 69.49 12.22 0.85 29.70 16.63 3.66 24.86 20.23 117.85 18.69 1.68 48.92 26.55 6.17 41.33 0.65 0.59 0.65 0.51 0.61 0.63 0.59 0.60 -2.36 -2.36 -2.35 -2.34 -2.34 -2.34 -2.33 -2.33 37 0 37 0 0 7.79 12.23 0.64 -2.33 1 0 1 0 0 6.83 10.53 0.65 -2.32 19 Polymenidou et al Pcdh7 Pde1a Tnfrsf25 Apba2bp Kcns2 BC052055 Alas2 D14Ertd171e Magi2 Galnt9 Nr4a3 Cntn4 Atp2b3 Kcnj9 Large Accn1 Rasgrf2 Xkr4 Cabp1 Scube1 Ephb6 AK147314 Wnt9a 2900046G09Rik Kcnab3 Atrnl1 Kcnmb4 Nr4a1 Phospho1 Bai3 Reps2 Mamdc1 Emx1 Slit2 Rgs7 Rims4 Osbpl10 Slc5a7 Tbr1 Rfx3 Lynx1 Itpka Irxl1 Sema5b Hnt Prkag1 Chrm3 3110035E14Rik 2610044O15Rik Rit2 AK142702 Egr4 Mical2 Sstr3 Ldlr Hcn1 E330039K12Rik Lrrn6c BC107398 Sidt1 Oprd1 Insig1 Mib1 B230343H07Rik Fxyd7 Unc5d Me3 Cacna1a S69381 Protocadherin 7 NM_018764 Phosphodiesterase 1A, calmodulinNM_001009978 dependent BC017526 Tnfrsf25 protein Amyloid beta A4 protein-binding family A AK013520 member 2-binding protein (X11L-binding protein 51) (mXB51) NM_181317 K+ voltage-gated channel, subfamily S, 2 Hypothetical protein LOC328643 NM_182636 Aminolevulinic acid synthase 2, erythroid NM_009653 CAZ-associated structural protein NM_177814 Activin receptor interacting protein 1 AK147530 UDP-N-acetyl-alpha-DNM_198306 galactosamine:polypeptide Orphan nuclear receptor NR4A3 (Orphan BC068150 nuclear receptor TEC) (Translocated in extraskeletal chondrosarcoma) AK163520 Contactin 4 Plasma membrane calcium ATPase 3 NM_177236 G protein-activated inward rectifier potassium channel 3 (GIRK3) (Potassium channel, inwardly rectifying subfamily J BC065161 member 9) (Inwardly rectifier K(+) channel Kir3.3) like-glycosyltransferase NM_010687 Accn1 protein BC038551 RAS protein-specific guanine nucleotideAK158472 releasing factor 2 NM_001011874 XK-related protein 4 calcium binding protein 1 NM_013879 Scube1 protein BC066066 Eph receptor B6 NM_007680 Delta kalirin-7 homolog AK147314 Wnt9a protein BC066165 Hypothetical protein LOC78408 NM_133778 Voltage-gated potassium channel beta-3 NM_010599 subunit (K(+) channel beta-3 subunit) (Kvbeta-3) NM_181415 Attractin-like 1 Calcium activated potassium channel beta 4 NM_021452 Nuclear receptor subfamily 4, group A, NM_010444 member 1 Phosphatase, orphan 1 NM_153104 Brain-specific angiogenesis inhibitor 3 BC032251 RalBP1 associated Eps domain containing AK147418 protein Hypothetical Phenylalanine-rich region AK163637 profile containing protein NM_010131 Empty spiracles homolog 1 Slit homolog 2 protein precursor (Slit-2) AK220505 Regulator of G protein signaling 7 NM_011880 Regulating synaptic membrane exocytosis 4 NM_183023 Oxysterol-binding protein-like protein 10 NM_148958 Solute carrier family 5 (choline transporter) NM_022025 T-box brain gene 1 AK158835 Regulatory factor X, 3 NM_011265 Ly6/neurotoxin 1 NM_011838 Inositol 1,4,5-trisphosphate 3-kinase A NM_146125 Hypothetical protein LOC210719 NM_177595 Semaphorin-5B precursor (Semaphorin G) BC052397 Neurotrimin precursor BC023307 AMP-activated protein kinase subunit AK157795 gamma 1 NM_033269 Cholinergic receptor, muscarinic 3, cardiac Hypothetical protein LOC76982 NM_178399 Hypothetical protein AK137318 Ras-like without CAAX 2 NM_009065 Hypothetical protein AK142702 Early growth response 4 NM_020596 Flavoprotein oxidoreductase MICAL2 NM_177282 Somatostatin receptor 3 NM_009218 Low density lipoprotein receptor AK170264 Hyperpolarization-activated, cyclic NM_010408 Hypothetical homeodomain-like structure AK220342 containing protein NM_175516 Leucine rich repeat neuronal 6C Hypothetical protein BC107398 SID1 transmembrane family, member 1 NM_198034 Opioid receptor, delta 1 NM_013622 Insulin induced gene 1 NM_153526 Ubiquitin ligase mind bomb NM_144860 RIKEN cDNA B230343H07 NM_001037906 FXYD domain-containing ion transport NM_022007 regulator NM_153135 Netrin receptor Unc5h4 Hypothetical protein FLJ34862 (Malic AK133593 enzyme) homolog Calcium channel, voltage-dependent, P/Q AK052625 type, alpha 1A subunit Potassium voltage-gated channel subfamily S69381 C member 3 Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience SUPPLEMENTARY INFORMATION 0 0 0 0 0 8.97 14.37 0.62 28 0 28 0 0 11.79 17.86 0.66 -2.32 -2.32 1 0 1 0 0 1.64 3.11 0.53 -2.32 0 0 0 0 0 6.64 10.29 0.65 -2.31 0 0 0 62 62 0 0 0 0 0 0 0 0 61 62 0 0 0 0 0 0 0 0 1 0 6.22 4.33 0.53 17.51 15.73 9.68 7.20 1.00 27.80 23.87 0.64 0.60 0.53 0.63 0.66 -2.30 -2.30 -2.30 -2.30 -2.30 9 0 9 0 0 5.48 8.80 0.62 -2.30 0 0 0 0 0 4.32 7.17 0.60 -2.30 37 2 0 0 37 2 0 0 0 0 6.24 15.15 9.63 23.00 0.65 0.66 -2.29 -2.28 0 0 0 0 0 14.13 21.28 0.66 -2.28 38 110 0 0 38 110 0 0 0 0 17.42 5.29 27.47 8.55 0.63 0.62 -2.28 -2.28 1 0 1 0 0 5.30 8.57 0.62 -2.27 42 2 4 1 33 1 0 0 0 0 0 0 0 0 42 2 4 1 33 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3.75 11.29 3.63 13.80 38.00 1.75 24.10 6.24 17.56 5.88 20.81 63.06 3.26 39.57 0.60 0.64 0.62 0.66 0.60 0.54 0.61 -2.27 -2.26 -2.26 -2.26 -2.25 -2.25 -2.25 0 0 0 0 0 4.85 7.87 0.62 -2.25 37 4 0 0 37 4 0 0 0 0 11.96 8.43 17.88 12.95 0.67 0.65 -2.24 -2.23 0 0 0 0 0 26.01 42.41 0.61 -2.23 0 91 0 0 0 91 0 0 0 0 1.90 13.83 3.41 20.65 0.56 0.67 -2.22 -2.21 -2.21 4 0 4 0 0 25.57 41.37 0.62 74 1 73 0 0 5.78 8.96 0.65 -2.21 0 10 55 5 27 0 2 30 0 0 5 1 127 0 0 1 0 0 0 0 0 0 0 0 0 0 0 10 54 5 27 0 0 30 0 0 5 1 127 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0.99 2.05 9.76 3.62 3.10 3.50 15.44 3.30 62.35 17.25 1.79 3.97 21.16 1.89 3.63 15.02 5.80 5.03 5.60 23.26 5.27 101.97 26.73 3.27 6.42 33.56 0.53 0.56 0.65 0.62 0.62 0.62 0.66 0.63 0.61 0.65 0.55 0.62 0.63 -2.20 -2.20 -2.20 -2.20 -2.20 -2.20 -2.19 -2.19 -2.19 -2.19 -2.19 -2.18 -2.18 0 0 0 0 0 2.04 3.59 0.57 -2.18 42 6 1 10 0 0 14 0 1 23 17 0 0 0 0 0 0 0 0 0 42 5 1 10 0 0 14 0 1 23 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 18.85 40.66 9.02 12.49 0.84 6.65 20.36 2.43 3.78 4.69 29.40 66.20 13.90 18.40 1.54 9.92 31.81 4.08 6.03 7.45 0.64 0.61 0.65 0.68 0.55 0.67 0.64 0.60 0.63 0.63 -2.17 -2.17 -2.16 -2.16 -2.16 -2.15 -2.15 -2.15 -2.14 -2.13 19 0 19 0 0 10.34 15.48 0.67 -2.13 1 1 13 0 0 15 19 0 0 0 0 0 0 0 1 1 13 0 0 12 19 0 0 0 0 0 2 0 0 0 0 0 0 1 0 2.24 1.04 3.38 7.33 9.29 14.45 3.48 3.84 1.93 5.31 10.66 14.22 21.16 5.48 0.58 0.54 0.64 0.69 0.65 0.68 0.64 -2.13 -2.13 -2.12 -2.12 -2.12 -2.12 -2.11 -2.11 1 0 1 0 0 4.44 7.04 0.63 45 0 45 0 0 1.48 2.62 0.57 -2.11 2 0 2 0 0 5.74 8.73 0.66 -2.10 35 0 35 0 0 20.73 32.14 0.65 -2.10 0 0 0 0 0 19.96 30.98 0.64 -2.10 20 Polymenidou et al Neurotractin isoform a NM_001039094 Similar to CUB and sushi multiple domains AK144913 protein 2 ST6 NM_011372 St6galnac3 Serpinb8 Serine protease inhibitor 8 AK048230 Rnf152 RING finger protein 152 AK035832 AI894139 Weakly similar to zinc finger protein ZNF65 AK133549 Unc13a Weakly similar to Munc- 13 BC058348 Dlgap1 Disks large-associated protein 1 (DAP-1) AK163689 Klf16 Kruppel-like factor 16 NM_078477 Sodium/calcium exchanger 1 precursor NM_011406 Slc8a1 (Na(+)/Ca(2+)-exchange protein 1) Early growth response 2 NM_010118 Egr2 Dhcr24 24-dehydrocholesterol reductase NM_053272 UDP-N-acetyl-alpha-DNM_173030 Galnt13 galactosamine:polypeptide Mesenchymal stem cell protein DSC54 AK046456 AK046456 homolog NM_172803 Dock4 Dedicator of cytokinesis 4 BC025575 Hypothetical protein LOC217219 NM_199200 Ncald Neurocalcin delta NM_134094 Neurofilament triplet M protein (160 kDa AK051696 Nef3 neurofilament protein) NM_009717 Neurod6 Neurogenic differentiation 6 Sema7a Semaphorin 7A NM_011352 Syt16 Synaptotagmin 14-like AK137129 Rims3 Hypothetical protein AK122226 Sodium channel, voltage-gated, type I, beta AK028238 AK028238 polypeptide NM_025404 Arfl4 ADP-ribosylation factor 4-like Hypothetical P-loop containing nucleotide AK034089 triphosphate hydrolases structure containing AK034089 protein BC103530 Cckbr Gastrin/cholecystokinin type B receptor Itpr1 Inositol 1,4,5-triphosphate receptor 1 AK049015 BC089626 Protein C6orf142 homolog BC089626 Eda Ectodysplasin A (EDA protein homolog) AJ243657 Protein kinase, cGMP-dependent, type I NM_001013833 Prkg1 alpha U7 snRNA-associated Sm-like protein Lsm11 NM_028185 Lsm11 3-hydroxy-3-methylglutaryl-Coenzyme A Hmgcr NM_008255 reductase NM_177056 A230078I05Rik Hypothetical protein LOC319998 5730507A09Rik Hypothetical protein LOC70638 NM_183087 Potassium voltage gated channel, ShawNM_001025581 Kcnc2 related SUPPLEMENTARY INFORMATION Negr1 92 0 91 1 0 28.02 44.28 0.63 AK144913 2 0 2 0 0 2.04 3.48 0.59 -2.09 60 2 10 1 15 83 0 0 0 0 0 2 1 0 60 2 10 1 13 82 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1.31 1.45 0.64 3.61 40.18 25.05 14.71 2.32 2.56 1.11 5.65 64.15 39.32 21.28 0.56 0.57 0.58 0.64 0.63 0.64 0.69 -2.09 -2.09 -2.08 -2.08 -2.08 -2.06 -2.06 36 0 36 0 0 8.61 12.73 0.68 -2.06 0 0 0 0 0 0 0 0 0 0 3.64 6.85 5.59 10.00 0.65 0.69 -2.06 -2.06 26 0 26 0 0 9.46 14.28 0.66 -2.06 20 0 20 0 0 1.63 2.86 0.57 -2.05 47 0 34 0 0 0 46 0 34 0 0 0 1 0 0 14.46 17.20 35.45 20.85 25.89 55.75 0.69 0.66 0.64 -2.05 -2.05 -2.03 Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience -2.09 0 0 0 0 0 43.91 69.36 0.63 -2.03 0 1 17 7 0 0 0 0 0 1 17 7 0 0 0 0 0 0 0 0 9.52 26.85 10.45 22.53 14.29 41.81 15.29 34.57 0.67 0.64 0.68 0.65 -2.02 -2.02 -2.02 -2.02 3 0 3 0 0 35.01 55.05 0.64 -2.02 0 0 0 0 0 14.88 21.39 0.70 -2.02 6 0 6 0 0 1.05 1.86 0.57 -2.02 0 31 9 1 0 1 0 0 0 28 8 1 0 0 1 0 0 2 0 0 7.29 69.56 3.45 0.71 10.40 109.54 5.26 1.20 0.70 0.63 0.66 0.59 -2.02 -2.01 -2.01 -2.01 46 0 46 0 0 0.77 1.31 0.59 -2.01 0 0 0 0 0 0.99 1.72 0.57 -2.01 2 0 1 1 0 13.27 18.85 0.70 -2.01 0 2 0 0 0 2 0 0 0 0 7.82 4.64 11.29 7.12 0.69 0.65 -2.00 -2.00 27 0 26 1 0 8.75 12.79 0.68 -2.00 21 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Table 3 ncRNAs with changes in expression levels upon TDP-43 depletion in adult mouse brain RNAdb index LIT3360 LIT1822 LIT1602 LIT1604 LIT3442 LIT3440 LIT3441 LIT1799 LIT1623 LIT1749 LIT3409 LIT3410 LIT3411 LIT3368 LIT3422 LIT3369 LIT3366 LIT3371 LIT2029 LIT2028 LIT3352 LIT3351 LIT3362 LIT1787 LIT1651 LIT1387 LIT1111 LIT1671 LIT3367 LIT3323 LIT1806 LIT1647 name reverse delta5-desaturase RNA (Revdelta5ase) Kruppel-type zinc-finger protein ZIM3-like mRNA, complete sequence Tsix gene domesticus antisense RNA from the Xist locus, complete sequence C030002C11Rik gene D630033A02Rik gene 2210403K04Rik gene clone MBI-109 miscellaneous RNA, partial sequence Rian mRNA, imprinted gene Mirg mRNA Otx2 opposite strand (Otx2OS) RNA, variant isoform Otx2 opposite strand (Otx2OS) RNA, variant isoform Otx2 opposite strand (Otx2OS) RNA, variant isoform RIKEN cDNA 2810037O22 gene Dleu2 mRNA, partial sequence, alternatively spliced retinal noncoding RNA 3 (RNCR3) RIKEN cDNA 3010002C02 gene retinal noncoding RNA 1 (RNCR1) vault RNA, complete sequence vault-associated RNA sequence metastasis associated lung adenocarcinoma transcript 1 (non-coding RNA), mRNA (cDNA clone IMAGE:3582796) metastasis associated lung adenocarcinoma transcript 1 (non-coding RNA), mRNA (cDNA clone IMAGE:3601939) Hepcarcin RNA, complete sequence clone MBI-32 miscellaneous RNA, partial sequence clone MBII-386 miscellaneous RNA, partial sequence U22 snoRNA host gene (UHG) gene, complete sequence empty spiracles 2 opposite strand (EMX2OS) Nespas mRNA RIKEN cDNA 2610100L16 gene RNA component of mitochondrial RNAase P (Rmrp) on chromosome 4. clone MBII-109 miscellaneous RNA, partial sequence clone MBII-367 miscellaneous RNA, partial sequence Chr Start Stop RPKM in strand TDP-43 KD RNA-seq RPKM in RPKM in control saline Change chr19 10249682 10250370 - 3.00 0.12 0.12 up chr7 6559771 6580748 - 0.62 0.01 0.01 up chrX 99634235 99687675 + 4.45 0.08 0.11 up chrX 99634235 99687675 + 4.45 0.08 0.11 up chr1 196724832 196738628 chr10 122388136 122388891 chr11 75277842 75282884 + + 2.96 0.64 1.29 8.63 1.72 3.58 6.81 1.95 3.11 down down down chr12 90196061 90196315 + 0.19 2.00 1.95 down chr12 110093968 110100034 chr12 110182787 110197261 + + 6.43 2.12 20.01 7.39 17.72 6.70 down down chr14 47591238 47715298 + 0.01 0.02 0.01 down chr14 47592109 47641765 + 0.01 0.02 0.01 down chr14 47595527 47783406 + 0.01 0.02 0.01 down chr14 59340093 59342302 + 4.89 15.56 18.00 down chr14 60557141 60636460 - 0.39 0.88 1.02 down chr14 63542224 63547029 + 11.49 42.49 35.15 down chr16 8767567 + 6.62 13.75 18.15 down 8766673 chr17 85515685 85526691 - 0.61 1.41 2.15 down chr18 36927839 36927979 + 2.96 8.92 3.67 down chr18 36927840 36927980 + 2.96 8.92 3.67 down chr19 5537030 5801973 - 3.61 13.25 10.70 down chr19 5625873 5801971 - 5.29 19.76 15.91 down chr19 5795689 5802671 - 123.89 522.39 428.33 down chr19 5796492 5796672 - 97.17 415.23 430.62 down chr19 5801468 5801548 - 53.60 219.41 222.85 down chr19 8791184 8793485 + 2.73 7.76 7.07 down 59511885 - 0.08 0.21 0.21 down chr2 173909368 173909996 - 0.46 1.10 1.68 down chr3 17982103 17991742 + 2.07 4.94 7.09 down chr4 43513885 43514159 - 11.64 40.17 29.24 down chr4 108974503 108974582 - 15.05 38.10 3.63 down chr4 131581604 131581673 + 2.12 8.30 8.88 down chr19 59478353 Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 22 Polymenidou et al LIT1313 LIT1314 LIT1808 LIT1345 LIT3370 LIT3450 LIT1225 LIT1226 LIT2020 LIT2015 LIT3416 LIT3432 LIT1675 LIT1355 LIT1629 LIT1703 LIT3324 LIT3329 LIT1010 LIT1761 LIT1009 LIT1760 LIT1008 LIT1006 LIT1003 LIT1001 RNA transcript from U17 small nucleolar RNA host gene U17HG gene clone MBII-123 miscellaneous RNA, partial sequence makorin 1 pseudogene mRNA, partial sequence retinal noncoding RNA 2 (RNCR2) Evf1 RNA, full length Mit1/Lb9 mRNA, partial sequence molossinus Mit1/Lb9 gene, partial sequence Y3 scRNA gene, partial sequence. Y1 scRNA gene, partial sequence ROSA 26 transcription 1 mRNA sequence UBE3A-ATS transcript containing exon L Ipw gene Ipw mRNA, partial sequence Pwcr1 mRNA, complete sequence KvLQT1-AS allele (LIT1) 7SK class III RNA gene inactive X specific transcripts (Xist) on chromosome X, transcript variant 1 inactive X specific transcripts (Xist) on chromosome X, transcript variant 2 Jpx mRNA, partial sequence Jpx mRNA, partial sequence Jpx mRNA, partial sequence Jpx mRNA, partial sequence Jpx mRNA, partial sequence Ftx noncoding RNA, exons 6-7 Ftx noncoding RNA, exons 6-7 Ftx noncoding RNA, exons 6-7 SUPPLEMENTARY INFORMATION chr4 131624009 131625709 - 1.42 3.89 3.02 down chr4 131624010 131625709 - 1.42 3.89 3.02 down chr4 148151418 148151511 - 17.86 42.33 13.45 down chr5 89913208 89913907 - 0.48 1.69 1.18 down chr5 112453533 112456585 - 9.34 37.31 18.35 down chr6 6771701 6811661 - 0.35 1.27 1.82 down chr6 30736982 30738860 + 12.42 34.37 39.93 down chr6 30738334 30738754 + 13.80 39.68 46.32 down chr6 47711224 47711289 + 741.35 2636.11 1178.13 down chr6 47717676 47717749 - 703.99 2681.21 1067.28 down chr6 113036182 113042974 - 1.12 2.43 2.52 down chr7 59091716 59094070 - 1.32 4.60 3.67 down chr7 chr7 59483753 59487027 59544353 59544354 - 0.03 0.39 0.10 1.50 0.08 1.23 down down chr7 59739300 59744327 - 0.07 0.26 0.12 down chr7 chr9 143103111 143107841 77960986 77961317 - 0.23 37.46 0.70 124.99 0.62 80.77 down down chrX 99663092 99685952 - 12.17 37.86 38.29 down chrX 99663092 99685952 - 12.17 37.86 38.29 down chrX chrX chrX chrX chrX chrX chrX chrX 99696298 99696350 99696350 99696383 99696384 99809719 99810272 99810325 99696574 99705674 99699246 99705680 99705680 99817912 99826413 99812065 + + + + + - 1.64 0.42 0.33 0.39 0.39 0.19 0.19 0.17 5.92 1.04 0.98 0.94 0.94 0.42 0.38 0.47 4.94 0.28 3.13 0.90 0.90 0.63 0.50 4.06 down down down down down down down down Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 23 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Table 4 Enriched Gene Ontology terms in TDP-43 target RNAs that are downregulated upon TDP-43 depletion in adult brain Gene Ontology Category Molecular Function Cellular Component Biological Process Enriched Gene Ontology Term GO:0005216 GO:0022838 GO:0015267 GO:0022803 GO:0005261 GO:0022836 GO:0046873 GO:0005509 GO:0022843 GO:0005244 GO:0022832 GO:0005262 GO:0022834 GO:0015276 GO:0031420 GO:0030955 GO:0005267 GO:0005249 GO:0005246 GO:0005245 GO:0016247 GO:0043167 GO:0046872 GO:0043169 GO:0008066 GO:0005886 GO:0031224 GO:0045202 GO:0044456 GO:0016021 GO:0031225 GO:0044459 GO:0030054 GO:0042734 GO:0045211 GO:0034703 GO:0005887 GO:0043005 GO:0034702 GO:0031226 GO:0014069 GO:0006811 GO:0007268 GO:0030001 GO:0019226 GO:0007267 GO:0050877 GO:0006812 GO:0044057 GO:0006816 GO:0015674 GO:0050804 GO:0046903 GO:0048878 GO:0051969 GO:0006873 GO:0031644 GO:0055082 GO:0032940 GO:0006813 GO:0050801 GO:0006887 GO:0019725 GO:0030182 GO:0015672 GO:0006836 GO:0042391 GO:0007610 GO:0048666 GO:0031175 GO:0007186 ion channel activity substrate specific channel activity channel activity passive transmembrane transporter activity cation channel activity gated channel activity metal ion transmembrane transporter activity calcium ion binding voltage-gated cation channel activity voltage-gated ion channel activity voltage-gated channel activity calcium channel activity ligand-gated channel activity ligand-gated ion channel activity alkali metal ion binding potassium ion binding potassium channel activity voltage-gated potassium channel activity calcium channel regulator activity voltage-gated calcium channel activity channel regulator activity ion binding metal ion binding cation binding glutamate receptor activity plasma membrane intrinsic to membrane synapse synapse part integral to membrane anchored to membrane plasma membrane part cell junction presynaptic membrane postsynaptic membrane cation channel complex integral to plasma membrane neuron projection ion channel complex intrinsic to plasma membrane postsynaptic density ion transport synaptic transmission metal ion transport transmission of nerve impulse cell-cell signaling neurological system process cation transport regulation of system process calcium ion transport di-, tri-valent inorganic cation transport regulation of synaptic transmission secretion chemical homeostasis regulation of transmission of nerve impulse cellular ion homeostasis regulation of neurological system process cellular chemical homeostasis secretion by cell potassium ion transport ion homeostasis exocytosis cellular homeostasis neuron differentiation monovalent inorganic cation transport neurotransmitter transport regulation of membrane potential behavior neuron development neuron projection development G-protein coupled receptor protein signaling pathway Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience Number of listed genes in term Total number of genes in term 21 21 21 21 18 19 19 26 11 12 12 8 8 8 10 8 8 7 4 4 4 46 45 45 4 61 77 21 15 67 12 33 17 6 9 7 13 12 8 13 5 24 15 19 16 16 20 19 13 10 10 9 11 14 9 12 9 12 10 9 12 8 13 13 11 7 8 12 11 10 12 164 165 169 169 115 138 146 391 67 94 94 34 46 46 103 63 65 52 8 12 13 2204 2157 2177 17 1375 2367 220 143 2262 106 831 272 21 70 48 205 182 76 214 37 360 124 228 157 165 284 275 118 72 92 70 118 205 73 152 76 156 103 85 168 66 209 221 158 52 79 200 170 141 207 % of listed Corrected genes in p-value* term 12.8 12.7 12.4 12.4 15.7 13.8 13.0 6.6 16.4 12.8 12.8 23.5 17.4 17.4 9.7 12.7 12.3 13.5 50.0 33.3 30.8 2.1 2.1 2.1 23.5 4.4 3.3 9.5 10.5 3.0 11.3 4.0 6.3 28.6 12.9 14.6 6.3 6.6 10.5 6.1 13.5 6.7 12.1 8.3 10.2 9.7 7.0 6.9 11.0 13.9 10.9 12.9 9.3 6.8 12.3 7.9 11.8 7.7 9.7 10.6 7.1 12.1 6.2 5.9 7.0 13.5 10.1 6.0 6.5 7.1 5.8 6.64E-12 3.74E-12 3.97E-12 3.97E-12 7.19E-12 9.42E-12 2.13E-11 1.03E-09 5.31E-07 1.20E-06 1.20E-06 6.74E-06 5.34E-05 5.34E-05 1.94E-04 3.87E-04 4.42E-04 1.09E-03 2.00E-03 7.09E-03 8.62E-03 8.83E-03 1.06E-02 1.25E-02 1.61E-02 1.32E-11 2.20E-09 5.97E-08 7.11E-06 1.10E-05 6.04E-05 2.46E-04 4.13E-04 4.61E-04 4.62E-04 2.64E-03 3.37E-03 4.33E-03 4.16E-03 4.01E-03 3.85E-02 1.23E-06 1.12E-06 1.19E-06 1.50E-06 2.38E-06 3.25E-06 9.59E-06 1.34E-05 8.28E-05 5.95E-04 5.56E-04 5.54E-04 5.46E-04 6.00E-04 7.05E-04 7.09E-04 7.96E-04 8.34E-04 1.37E-03 1.35E-03 1.81E-03 1.96E-03 3.21E-03 3.47E-03 3.34E-03 4.63E-03 4.86E-03 5.44E-03 6.04E-03 5.91E-03 24 Polymenidou et al GO:0042592 homeostatic process GO:0051899 membrane depolarization GO:0003001 generation of a signal involved in cell-cell signaling GO:0006874 cellular calcium ion homeostasis GO:0055074 calcium ion homeostasis GO:0055066 di-, tri-valent inorganic cation homeostasis GO:0050905 neuromuscular process GO:0007626 locomotory behavior GO:0048812 neuron projection morphogenesis GO:0006875 cellular metal ion homeostasis GO:0048667 cell morphogenesis involved in neuron differentiation GO:0055065 metal ion homeostasis GO:0007611 learning or memory GO:0007612 learning GO:0007628 adult walking behavior *The p-value indicated is corrected for multiple testing using the Benjamini-Hochberg method. Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience SUPPLEMENTARY INFORMATION 15 5 6 6 6 7 5 8 8 6 8 6 6 5 4 329 26 48 52 54 82 33 113 114 57 115 59 60 38 19 4.6 19.2 12.5 11.5 11.1 8.5 15.2 7.1 7.0 10.5 7.0 10.2 10.0 13.2 21.1 8.13E-03 1.27E-02 1.56E-02 2.19E-02 2.53E-02 2.75E-02 2.76E-02 2.77E-02 2.84E-02 2.82E-02 2.84E-02 3.14E-02 3.30E-02 3.94E-02 4.33E-02 25 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Table 5 TDP-43 binding and regulation on transcripts encoding for proteins involved in RNA metabolism Number of CLIP-seq clusters in introns in exons in 3'UTR RPKM in TDP-43 KD RPKM in control 309 1 308 0 0 52.07 53.16 0.98 -0.02 More exclusion (chr16:7307308-7307361) Adenosine deaminase, RNA-specific, B1 NM_001024839 14 0 14 0 0 21.21 27.16 0.78 -1.24 No Senataxin NM 198033 Ataxin-2 NM 009125 CUG triplet repeat, RNA-binding protein NM_017368 1 ELAV NM 010485 ELAV-like protein 2 (Hu-antigen B) BC049125 ELAV-like protein 3 NM 010487 ELAV-like 4 isoform a NM 010488 Elongation protein 3 homolog AK088457 Ewing sarcoma homolog AK147909 Fus/TLS protein BC011078 FUS interacting protein AK169071 Muscleblind-like 1 NM 020007 Muscleblind-like 2 AK164372 Neuro-oncological ventral antigen 1 NM 021361.1 Neuro-oncological ventral antigen 2 NM 001029877. 1 18 0 0 1 18 0 0 0 0 13.72 18.71 12.97 15.96 1.06 1.17 0.04 0.58 No No 17 0 17 0 0 34.87 37.92 0.92 -0.35 No 2 17 7 8 3 8 4 3 19 14 15 7 0 0 0 0 0 0 0 0 0 0 0 0 2 17 7 8 3 6 0 1 19 14 13 6 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 2 1 2 0 0 2 1 17.20 9.71 35.32 14.14 15.61 44.86 92.88 16.64 39.54 68.83 7.62 15.04 16.79 11.13 35.20 14.16 17.54 48.66 129.56 17.47 34.50 66.40 16.97 36.85 1.02 0.87 1.00 1.00 0.89 0.92 0.72 0.95 1.15 1.04 0.45 0.41 -0.09 -0.91 0.05 -0.23 -0.81 -0.31 -1.46 -0.46 0.69 0.24 3 0 3 0 0 21.64 21.98 0.98 -0.17 Protein A2bp1/Fox1 Ataxin-2 binding protein 1 Cugbp1 Elavl1 Elavl2 Elavl3 Elavl4 Elp3 Ewsr1 Fus Fusip1 Mbnl1 Mbnl2 Nova1 Nova2 AY659958 Ratio Z KD/control score Splicing changes Change upon TDP-43 depletion (exon coordinates) in 5'UTR Gene symbol Adarb1 (ADAR2) Als4 Atxn2 Refseq/mRNA Total identifier RNA-seq Ptbp2 Polypyrimidine tract binding protein 2 NM_019550 Raly hnRNP-associated with lethal yellow NM_023130 9 0 9 0 0 17.42 19.52 0.89 -0.73 Rbm9/Fox2 Sfrs9 Fox-1 homolog Splicing factor, arginine/serine rich 9 TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 68 kDa Cytotoxic granule-associated RNA binding protein AK044929 NM 025573 32 1 0 0 32 1 0 0 0 0 45.73 13.12 54.32 12.28 0.84 1.07 -0.72 0.08 No No No No No No No No No No No No More exclusion (chr3:119716615-119716649) More exclusion (chr2:154551351-154551399) No No AK011843 5 0 5 0 0 26.82 28.03 0.96 -0.22 No NM_011585 6 0 6 0 0 26.46 21.48 1.23 0.92 More inclusion (chr6:86384734-86384767) Taf15 Tia1 Coordinates on mm8 mouse genome. *Human orthologue gene. Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 26 Polymenidou et al SUPPLEMENTARY INFORMATION Supplementary Table 6 TDP-43 binding and regulation on transcripts associated with disease Number of CLIP-seq clusters Gene symbol Adarb1 (ADAR2)* Als2 Als4 Ang1 App Ar Atxn1 Atxn10 Atxn2 Atxn7 Cacna1c Chmp2b Dctn1 Elp3 Fbxo7 Fgf14 Fig4 Fmr1 Fus/Tls Fxn Gria2 Gria3 Grik2 Grin1 Grin2a Grin2c Grn Hdac6 Hdh Ighmbp2 Kcnma1 Protein Refseq/mRNA identifier Adenosine deaminase, RNANM_001024839 specific, B1 Alsin AK020484 Senataxin NM 198033 Angiogenin NM 007447 Amyloid beta precursor protein AK148439 NM_013476 Androgen receptor Ataxin-1 NM 009124 Ataxin-10 NM 016843 Ataxin-2 NM 009125 Ataxin-7 NM 139227 Calcium channel, voltageNM_009781 dependent, L type Chromatin modifying protein 2B NM 026879 Dynactin 1 AK157867 Elongation protein 3 homolog AK088457 F-box only protein 7 BC032153 Fibroblast growth factor 14 NM_207667 isoform b Sac domain-containing inositol NM_133999 phosphatase 3 Fragile X mental retardation NM_008031 protein 1 Fus/Tls protein BC011078 Frataxin BC058533 Glutamate receptor, ionotropic, NM_001039195 AMPA2 (alpha 2) Glutamate receptor, ionotropic, NM_016886 AMPA3 (alpha) Glutamate receptor, ionotropic, NM_010349 kainate 2 (beta) Glutamate receptor, ionotropic, BC039157 NMDA1 (zeta 1) Glutamate receptor, ionotropic, NM_008170 NMDA2A (epsilon) Glutamate receptor, ionotropic, NM_010350 NMDA2C (epsilon) Progranulin NM 008175 Histone deacetylase 6 NM 010413 Huntingtin NM 010414 Immunoglobulin mu binding NM_009212 protein 2 Calcium-activated potassium channel alpha subunit 1 U09383 (Calcium-activated potassium channel, subfamily M, alpha subunit 1) Total in 5'UTR in introns in exons RNA-seq in 3'UTR Ratio RPKM in RPKM in TDP-43 KD control KD/control Splicing changes Z score Change upon TDP-43 depletion (exon coordinates) 14 0 14 0 0 21.21 27.16 0.78 -1.24 No 8 1 0 34 0 64 9 18 4 0 0 0 0 0 0 0 0 0 8 1 0 29 0 64 9 18 3 0 0 0 3 0 0 0 0 0 0 0 0 2 0 0 0 0 1 6.67 13.72 21.94 219.61 2.90 16.02 43.25 18.71 6.55 6.94 12.97 9.78 263.72 2.73 19.97 40.78 15.96 6.70 0.96 1.06 2.24 0.83 1.06 0.80 1.06 1.17 0.98 -0.55 0.04 3.79 -0.77 -0.33 -1.31 0.30 0.58 -0.45 No No No No No No No No No 77 0 77 0 0 5.57 9.56 0.58 -2.65 No 1 8 3 5 0 0 0 0 1 8 3 5 0 0 0 0 0 0 0 0 16.51 40.37 15.61 9.99 12.82 50.11 17.54 7.35 1.29 0.81 0.89 1.36 0.96 -0.93 -0.81 0.95 No No No No 85 0 85 0 0 6.99 14.99 0.47 -3.76 No 3 0 3 0 0 10.36 8.58 1.21 0.50 No 0 0 0 0 0 17.03 17.38 0.98 -0.31 No 4 0 0 0 0 0 3 0 1 0 92.88 1.57 129.56 1.10 0.72 1.43 -1.46 0.41 No No 18 1 14 0 3 105.96 110.65 0.96 -0.12 No More exclusion (chrX:37898880-37914127) 18 0 17 0 1 33.82 45.41 0.74 -1.35 52 0 52 0 0 17.00 27.85 0.61 -2.46 No 3 0 3 0 0 62.95 83.99 0.75 -1.25 More exclusion (chr2:25133351-25133414) 83 0 83 0 0 2.19 4.23 0.52 -2.56 No 3 0 3 0 0 7.38 11.76 0.63 -2.45 No 1 3 10 0 0 0 0 3 9 0 0 1 1 0 0 70.17 8.98 8.52 23.12 10.54 16.55 3.04 0.85 0.51 5.22 -1.05 -3.29 no No No 1 0 1 0 0 3.26 2.98 1.09 -0.19 No 96 0 96 0 0 8.34 14.36 0.58 -2.66 No Kif5a Kinesin family member 5A NM_001039000 5 0 2 0 3 261.61 364.81 0.72 -1.45 No Kifap3 Kinesin-associated protein 3 NM 010629 5 0 5 0 0 43.38 50.66 0.86 -0.65 No Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 27 Polymenidou et al Lrrk2 Mapt Mecp2 Mjd/Atxn3 Nefl Nlgn1 Nrxn1 SUPPLEMENTARY INFORMATION Leucine-rich repeat kinase 2 NM 025730 Tau protein NM 001038609 Methyl CpG binding protein 2 AK158664 Mjd/Ataxin-3 protein BC087880 Neurofilament, light polypeptide NM 010910 Neuroligin 1 NM 138666 Neurexin-I-alpha precursor AB093249 (Neurexin I-alpha) Nrxn2 Hypothetical Laminin G Nrxn3 Neurexin III Optn Optineurin Park2 Parkin Park7 DJ-1 protein Pink1 PTEN induced putative kinase 1 Pla2g6 Phospholipase A2, group VI Prnp Prion protein Psen1 Presenilin 1 Psen2 Presenilin 2 Slc1a2/Glt1 (EAAT2)* Solute carrier family 1 Slc1a3 (EAAT1)* Solute carrier family 1 Slc1a6 (EAAT4)* Solute carrier family 1 Smn1 Survival motor neuron protein Snca Synuclein, alpha Sncb Synuclein, beta Sod1 Superoxyde dismutase 1 Sort1 Sortilin 1 Spg11 Tbp Tnrc15/Gigyf2 4 10 4 3 1 102 0 0 0 0 0 0 4 10 4 3 0 101 0 0 0 0 0 1 0 0 0 0 1 0 11.28 80.18 29.52 9.65 40.04 8.55 14.49 97.03 30.32 8.49 61.29 16.28 0.78 0.83 0.97 1.14 0.65 0.53 -1.41 -0.80 -0.15 0.21 -1.88 -3.19 No No No No No No 162 0 162 0 0 18.66 33.28 0.56 -2.76 No AK163904 28 0 27 0 1 41.94 45.65 0.92 -0.34 BC060719 178 0 178 0 0 16.25 37.23 0.44 -3.93 NM 181848 AF250294 NM 020569 NM 026880 NM 016915 NM 011170 NM 008943 AF038935 NM 011393 NM 148938 NM 009200 BC045158 AK014472 NM 033610 BC057074 1 39 0 0 0 4 4 1 28 2 3 0 3 3 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 39 0 0 0 2 3 1 26 2 3 0 2 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 0 6.07 0.63 15.89 81.90 2.70 179.48 18.45 4.09 139.06 134.38 1.27 6.40 37.26 54.68 49.95 5.06 2.18 14.92 82.18 3.00 153.18 16.55 3.29 156.13 97.73 1.32 5.76 42.97 68.33 45.48 1.20 0.29 1.06 1.00 0.90 1.17 1.11 1.24 0.89 1.37 0.96 1.11 0.87 0.80 1.10 0.29 -3.78 0.07 0.06 -0.81 0.81 0.34 0.23 -0.46 1.55 -0.63 0.01 -0.63 -0.95 0.50 NM_019972 8 0 8 0 0 75.67 86.65 0.87 -0.55 4.75 7.33 14.87 4.26 6.06 13.80 1.12 1.21 1.08 -0.04 0.35 0.12 More exclusion (chr19:6513715-6513805) More inclusion (chr12:90604171-90604261) No No No No No No No No No No No No No No No More inclusion (chr3:108483527-108483626) No No No 23.19 19.27 1.20 0.80 No 93.43 90.90 1.03 0.20 No 23.29 26.90 0.87 -0.73 No 11.57 9.91 1.17 0.43 No Spatacsin BC019404 4 0 4 0 0 TATA binding protein BC016476 0 0 0 0 0 Tnrc15 protein BC027137 15 1 14 0 1 Ubiquitin protein ligase E3A Ube3a NM_173010 3 0 3 0 0 isoform 1 Ubiquitin carboxy-terminal Uchl1 NM_011670 0 0 0 0 0 hydrolase L1 Vesicle-associated membrane Vapb NM_019806 7 2 5 0 2 protein, associated Valosin containing protein NM_009503 1 0 1 0 0 Vcp Z-scores between -1.96 and 1.96 represent p-values greater than 0.05. Coordinates on mm8 mouse genome. *Human orthologue gene. Nature Neuroscience: doi:10.1038/nn.2779 nature neuroscience 28