Identification of novel epigenetic biomarkers in colorectal cancer
Identification of novel epigenetic biomarkers in colorectal cancer, GLDC and PPP1R14A

Deeqa Ahmed Mohamed Ali

Thesis for the master's degree in Molecular Biosciences at the Department of Molecular Biosciences (IMBV), Faculty of Mathematics and Natural Sciences

UNIVERSITY OF OSLO
February 2010

Acknowledgements

The work presented in this thesis was carried out at the Project Group of Epigenetics, Department of Cancer Prevention, Institute for Cancer Research, Oslo University Hospital-Radiumhospitalet, from July 2008 to February 2010.

First and foremost, I would like to express my sincere gratitude to my supervisor Guro Elisabeth Lind, for her encouragement, guidance and invaluable assistance throughout the project. Her knowledge of and enthusiasm for the field of epigenetics have been a true inspiration. I would also like to thank my co-supervisor and head of department, Professor Ragnhild Lothe, for including me in her outstanding academic department and for contributing with objective comments.

I am grateful to my colleagues for creating a wonderful social environment and for always being available to answer my questions. I would especially like to thank Hilde, both for helping out in the lab and for contributing to determining the final scores, Stine for assisting me with the statistical analyses, and Anne Cathrine and Marianne for our many scientific discussions and for always putting a smile on my face.

This thesis could not have been completed had it not been for the support and love I have received from my friends and family. A special thanks to my parents for supporting my dreams and aspirations, and for making me believe that I can achieve anything I set my mind to.
Oslo, February 2010
Deeqa Ahmed M. Ali

Table of contents

Acknowledgements
Table of contents
Summary
Abbreviations
Gene symbols
1. Introduction
  1.1 Genetic and epigenetic alterations in carcinogenesis
  1.2 Epigenetic regulation of gene expression
    1.2.1 Defining epigenetics
    1.2.2 The interplay of epigenetic regulators
    1.2.3 RNA-mediated gene silencing
    1.2.4 Epigenetics: Nature or nurture or both?
  1.3 Colorectal cancer
    1.3.1 Molecular developmental pathways
    1.3.2 Histopathology and morphological pathways
    1.3.3 Tumour classification, treatment and outcome
  1.4 Clinical relevance of molecular biomarkers
2. Aims
3. Materials and methods
  3.1 Materials
    3.1.1 Colon cancer cell lines
    3.1.2 Tissue samples – Colorectal carcinomas and normal mucosa
  3.2 Methylation-specific methodologies
    3.2.1 Strategy to select novel DNA methylation candidate genes
    3.2.2 Bisulfite modification
    3.2.3 Qualitative methylation-specific polymerase chain reaction
    3.2.4 Quantitative real-time methylation-specific polymerase chain reaction
    3.2.5 Capillary electrophoresis sequencing
    3.2.6 Bisulfite sequencing
    3.2.7 Statistics
4. Results
  4.1 Qualitative methylation analyses of candidate genes in vitro and in vivo
  4.2 Quantitative methylation profiles of GLDC and PPP1R14A
  4.3 Concordance of conventional MSP and quantitative real-time MSP
  4.4 Bisulfite sequencing confirms the promoter methylation status of GLDC and PPP1R14A
  4.5 Association of tumour methylation with genetic and clinico-pathological features
5. Discussion
  5.1 Methodological considerations
    5.1.1 Methylation-specific polymerase chain reaction
    5.1.2 Bisulfite sequencing
  5.2 Cell lines versus solid tumours
  5.3 Novel epigenetically deregulated genes in colorectal cancer
  5.4 Early detection and diagnostics
6. Conclusions
7. Future perspectives
8. Reference list
Appendix I – Tumour samples
Appendix II – Normal tissue samples
Appendix III – Qualitative MSP analyses
Appendix IV – Quantitative MSP analyses

Summary

Colorectal cancer is one of the most common malignancies in the Western world, with an incidence of 3500 new cases per year in Norway alone. There is a need for improved early diagnostics as well as more precise cancer diagnosis to better guide the choice of treatment. CpG island hypermethylation of tumour-suppressor genes has been established as a key molecular event in colorectal cancer. Furthermore, DNA hypermethylation occurs early during tumour development, suggesting that it could be used as a molecular marker for early detection of the disease. Determining the methylation frequencies of target genes in colorectal cancer could therefore help discover novel biomarkers with diagnostic potential.

The objective of this study was to identify novel epigenetic biomarkers in colorectal cancer. A set of candidate genes was selected after treatment of colon cancer cell lines with 5-aza-2´-deoxycytidine (AZA) and trichostatin A (TSA), and subsequent microarray gene expression analysis. In silico analyses were then performed on the candidate genes to search for CpG islands in their promoter regions. Ten genes were investigated in vitro for promoter hypermethylation by methylation-specific PCR in colon cancer cell lines (n = 20).
Six of the ten genes were methylated in more than 14 of the cell lines and were subjected to an in vivo pilot methylation study of primary colorectal carcinomas (n = 20) and normal mucosa samples (n = 10). The two most promising genes, GLDC and PPP1R14A, were further investigated by quantitative real-time methylation-specific PCR in an extended series of malignant (n = 47) and normal (n = 49) colorectal tissue samples. Promoter hypermethylation of GLDC and PPP1R14A had sensitivities of 60% and 57%, respectively, in colorectal carcinomas, whereas normal mucosa samples were unmethylated for both genes, resulting in 100% specificity. Promoter methylation was independent of tumour stage, age and gender of the patients. PPP1R14A was significantly more methylated in tumours with microsatellite instability and thus in tumours located on the right side of the colon.

In the present study, GLDC and PPP1R14A are identified as novel methylated gene targets in colorectal cancer.

Abbreviations

ACF         Aberrant crypt foci
ATP         Adenosine triphosphate
AZA         5-aza-2´-deoxycytidine
bp          Base pairs
CIMP        CpG island methylator phenotype
CIN         Chromosomal instability
CpG         Cytosine phosphate guanine
CRC         Colorectal cancer
Ct          Cycle threshold
ddNTP       Dideoxyribonucleotide triphosphate
DNA         Deoxyribonucleic acid
DNMT        DNA methyltransferase
dNTP        Deoxyribonucleotide triphosphate
FAP         Familial adenomatous polyposis
FOBT        Fecal occult blood test
HNPCC       Hereditary non-polyposis colorectal cancer
ICF         Immunodeficiency centromeric instability facial anomalies
LOH         Loss of heterozygosity
LOI         Loss of imprinting
MBD         Methyl binding domain
MeCP2       Methyl CpG binding protein 2
MGB         Minor groove binder
miRNA       Micro ribonucleic acid
MMR         DNA mismatch repair
mRNA        Messenger ribonucleic acid
MSI         Microsatellite instability
MSP         Methylation-specific polymerase chain reaction
MSS         Microsatellite stable
nt          Nucleotides
PCR         Polymerase chain reaction
PMR         Percent methylated reference
qMSP        Quantitative methylation-specific polymerase chain reaction
RNA         Ribonucleic acid
ROC         Receiver operating characteristics
siRNA       Small interfering ribonucleic acid
SWI/SNF     SWItch/Sucrose NonFermentable
TAE buffer  Tris-acetate ethylenediaminetetraacetate buffer
Tm          Melting temperature
TSA         Trichostatin A
XIST        X-inactive specific transcript

Gene symbols¹

BNIP3     BCL2/adenovirus E1B 19kDa interacting protein 3
BRAF      v-raf murine sarcoma viral oncogene homolog B1
CBS       Cystathionine-beta-synthase
CD44      CD44 molecule (Indian blood group)
CDKN2A    Cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4)
CTNNB1    Catenin (cadherin-associated protein), beta 1, 88kDa
DDX43     DEAD (Asp-Glu-Ala-Asp) box polypeptide 43
EGFR      Epidermal growth factor receptor (erythroblastic leukemia viral (v-erb-b) oncogene homolog, avian)
GLDC      Glycine dehydrogenase (decarboxylating)
H19       H19, imprinted maternally expressed transcript (non-protein coding)
HRAS      v-Ha-ras Harvey rat sarcoma viral oncogene homolog
IGF2      Insulin-like growth factor 2 (somatomedin A)
IQCG      IQ motif containing G
KRAS      v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog
MAL       Mal, T-cell differentiation protein
MGMT      O-6-methylguanine-DNA methyltransferase
MLH1      MutL homolog 1, colon cancer, nonpolyposis type 2 (E. coli)
MSH2      MutS homolog 2, colon cancer, nonpolyposis type 1 (E. coli)
PEG10     Paternally expressed 10
PIK3CA    Phosphoinositide-3-kinase, catalytic, alpha polypeptide
PPP1R14A  Protein phosphatase 1, regulatory (inhibitor) subunit 14A
PTEN      Phosphatase and tensin homolog
RASSF4    Ras association (RalGDS/AF-6) domain family member 4
RB        Retinoblastoma 1
RBP7      Retinol binding protein 7, cellular
SEPT9     Septin 9
SFRP2     Secreted frizzled-related protein 2
TFPI2     Tissue factor pathway inhibitor 2
TP53      Tumor protein p53
VIM       Vimentin
WDR21B    DDB1 and CUL4 associated factor 4-like 1
WT1       Wilms tumor 1

¹ Gene symbols and full names are approved by the HUGO Gene Nomenclature Committee (http://www.genenames.org). Approved gene symbols are used throughout the thesis.

1. Introduction

1.1 Genetic and epigenetic alterations in carcinogenesis

A tight network of controls regulates the mechanisms that govern normal cell proliferation and homeostasis, and disruption of these controls may lead to successful tumour development. Hanahan and Weinberg have described six essential changes that constitute the hallmarks of cancer: insensitivity to anti-growth signals, evading apoptosis, self-sufficiency in growth signals, sustained angiogenesis, limitless replicative potential, and tissue invasion and metastasis. It has long been accepted that genetic alterations can cause cancer; over the last decades, however, the importance of epigenetic changes in the initiation and progression of cancer has also been widely acknowledged. The genetic and epigenetic processes seem to be interconnected in driving the development of tumours (Figure 1).

Figure 1. Epigenetic plasticity and genetic lesions drive tumour progression. Cancer evolves through a series of genetic and epigenetic alterations. These acquired changes will eventually give rise to successive subclones with selective advantages over neighbouring clones. The gained biological properties could include the six hallmarks of cancer, as well as other important survival or growth advantages.

In 1976, Nowell proposed that tumour development is caused by acquired genetic variability in a cell, which may allow natural selection of subgroups, resulting in monoclonal tumours [1]. This clonal expansion model for tumour development is now widely accepted. The model has similarities with Darwinian evolution, in which several successful changes and “survival of the fittest” cells may lead to a monoclonal population. However, cytogenetic studies have shown that the acquisition of new mutant alleles by cells in the population may in some cases result in genetic heterogeneity within the population, leading to polyclonality [2].

The cancer stem cell hypothesis was first proposed 150 years ago.
Cell surface marker expression analysis indicates that the cells of a tumour can be sorted into a major and a minor population, where the latter constitutes less than 1% of the cells in the tumour [3]. The cells of the minor population display several abilities that resemble those of stem cells, i.e. self-renewal and differentiation, both crucial properties in driving malignancy. Self-renewal drives tumourigenesis, whereas differentiation contributes to the heterogeneous phenotype of the tumours. Because stem cells have an unlimited ability to proliferate, it is likely that the tumourigenic cancer stem cells are the drivers of multistep tumourigenesis [4].

The multistep tumourigenesis pathway is a consequence of alterations in three different types of gene families: proto-oncogenes, tumour-suppressor genes and DNA repair genes. A proto-oncogene encodes a protein that stimulates cell growth, and when altered, it may increase the ability of the cell to divide extensively. Thus, proto-oncogenes can become cancer-causing oncogenes, with the ability to transform normal cells and induce cancer. Dominant gain-of-function mutations or hypomethylation of proto-oncogenes render the genes constitutively active, or active under conditions where the wild-type gene is not. Tumour-suppressor genes code for proteins that restrain cell growth, thereby reducing the possibility that a cell will develop into a tumour. Recessive loss-of-function mutations or hypermethylation, leading to partial or complete inactivation of tumour-suppressor genes, may reduce the ability of the genes to constrain cell proliferation. The DNA repair genes, also referred to as caretaker genes, are under normal circumstances responsible for maintaining the integrity of the genome [5].

A gain-of-function alteration in one allele of a proto-oncogene is sufficient to activate an oncogene, while both copies of a recessive tumour-suppressor gene must be eliminated to produce a phenotypic change.
Alfred Knudson elegantly postulated this “two-hit” hypothesis for inactivation of tumour-suppressor genes in 1971, using the retinoblastoma gene as an example [6].

1.2 Epigenetic regulation of gene expression

1.2.1 Defining epigenetics

Epigenetics is defined as stable changes in gene expression that are inherited through subsequent cell divisions and are not due to a change in DNA sequence [7]. The term “epigenetics” was first introduced by C.H. Waddington in 1942 to describe “the causal interactions between genes and their products, which bring the phenotype into being” [8]. However, recent publications indicate that experiments by Paul Kammerer, performed as early as the 1900s, revealed an epigenetic mechanism [9,10]. Epigenetic inheritance includes DNA methylation, histone modifications and RNA-mediated silencing [11]. DNA methylation is the best-known epigenetic marker, and global hypomethylation was the first epigenetic abnormality to be identified in cancer cells [12].

The molecular role of DNA methylation

DNA methylation affects the packing of chromatin and the overall architecture of the nucleus, and thereby has a critical role in the control of gene expression [13]. The modification, the addition of a methyl group, occurs on cytosines (C) located immediately upstream (5´) of guanines (G); these sites are termed CpG dinucleotides. In vertebrates, the genome is depleted of CpG dinucleotides as a consequence of spontaneous deamination of 5-methylcytosines to thymines (T), leading to C-to-T transition mutations [14]. The remaining CpG dinucleotides are unequally distributed throughout the genome and are especially common in centromeric regions as well as in repetitive sequences. Stretches of sequence containing the theoretically expected frequency of CpG dinucleotides are frequently located in the 5´ promoter regions of genes.
These stretches are called “CpG islands” [15] and are present in approximately half of all human genes [16]. According to Takai and Jones, a CpG island is defined as a region of minimum 200 base pairs, with a GC content higher than 55% and an observed to expected ratio of CpG higher than 0.65 [17].

Cytosine methylation is catalysed by DNA methyltransferases, which transfer a methyl group from the universal methyl donor S-adenosylmethionine to the carbon-5 position of cytosine (Figure 2). Maintenance and restoration of the methylation patterns of hemi-methylated strands after DNA replication is carried out by DNA methyltransferase 1 (DNMT1) [18]. The function of this enzyme is mainly to sustain the methylation patterns in proliferating cells. DNMT3A and DNMT3B, on the other hand, are required to initiate de novo methylation, and thereby establish new DNA methylation patterns [19].

Figure 2. The process of DNA methylation. Methylation of cytosine is an epigenetic mechanism which provides an extra layer of transcriptional control. The cytosine is methylated in the carbon-5 position by the action of DNA methyltransferases.

The role of DNA methylation in cancer

Global genomic hypomethylation in tumours

Hypomethylation is defined as a decrease in the level of DNA methylation at CpG sites in a given sample, in comparison to normal tissue [20]. In 1982, Feinberg and Vogelstein found that a considerable number of CpGs methylated in normal tissue were unmethylated in cancer cells [12]. Further research confirmed that the global level of 5-methylcytosine was reduced in all tumour types, both benign and malignant, compared to their normal counterparts [21]. The overall decrease in DNA methylation is mainly due to hypomethylation of repetitive sequences, and the degree of hypomethylation increases during the neoplastic development of a benign lesion to an invasive cancer.
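As a rough illustration of the CpG island definition quoted above from Takai and Jones, the three criteria (length, GC content, observed to expected CpG ratio) can be checked for a given sequence window. This is a minimal sketch and not part of the methods of this thesis. The function names are hypothetical. The observed to expected ratio is computed as (number of CpG × window length) / (number of C × number of G), a commonly used formula that is assumed here because the text does not state one. The default thresholds simply repeat the values given in the text.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC content and observed/expected CpG ratio of a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    # Assumed formula: obs/exp = (#CpG * N) / (#C * #G); 0.0 if C or G is absent
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "gc_content": gc_content, "obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.55,
                          min_obs_exp: float = 0.65) -> bool:
    """Check a window against the thresholds quoted in the text (hypothetical helper)."""
    s = cpg_stats(seq)
    return (s["length"] >= min_len
            and s["gc_content"] > min_gc
            and s["obs_exp"] > min_obs_exp)


if __name__ == "__main__":
    window = "CG" * 100  # artificial 200 bp window made only of CpG dinucleotides
    print(cpg_stats(window))
    print(looks_like_cpg_island(window))
```

In practice the check would be applied to sliding windows across a promoter sequence, and dedicated tools handle window merging and edge cases that this sketch ignores.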
Several alternative mechanisms have been suggested to explain how DNA hypomethylation can contribute to the development of a cancer cell (Figure 3). First, methylation of repetitive sequences is thought to impair their ability to promote chromosomal rearrangements, so undermethylation of these regions can favour mitotic recombination. This theory is supported by experiments in which loss of DNMT and the resulting DNA hypomethylation led to chromosomal instability in human cancer cells [22]. In addition, intragenomic parasitic DNA sequences, such as L1 (long interspersed nuclear element), are inactivated by methylation in normal cells. Demethylation of these transposons may lead to translocations, insertions and deletions, further disrupting the genome [23]. A study of genomic hypomethylation in colorectal neoplasia showed that global hypomethylation occurs early in tumour development and that the frequency or extent of hypomethylation was independent of tumour subtype [24].

Second, hypomethylation of DNA may lead to inappropriate gene activation by demethylating CpG islands located in the promoters of proto-oncogenes. This leads to expression of normally inactivated genes, such as the HRAS oncogene [25]. Testis-cancer antigens (TCA) are also reactivated by this mechanism. These genes are generally methylated and not expressed in normal tissue, with the exception of the testis. As a result of hypomethylation, the proteins encoded by this gene family are expressed in cancer cells, where they can serve as antigens.

Finally, loss of methyl groups may disrupt genomic imprinting and consequently increase the risk of cancer (see section below).

Figure 3. The effect of epigenetic alterations in cancer. The figure illustrates the increasing difference in methylation patterns and histone modifications during tumourigenesis.
Both mechanisms play an important role in normal development, and loss of epigenetic control in cancer cells can affect closely regulated mechanisms, including genomic imprinting, X-chromosome inactivation and silencing of repetitive sequences.

Gene specific DNA hypermethylation

The retinoblastoma (RB) gene was not only the first gene to be characterised as a tumour-suppressor, but was also the first gene to be identified as hypermethylated. In 1994, Horsthemke et al documented the connection between reduced expression of the RB gene in tumours and hypermethylation of the promoter [26]. Hypermethylation of CpG islands located in promoter regions of genes is a major event in the development of the majority of cancer types, due to the subsequent aberrant silencing of important tumour-suppressor genes (Figure 4) [13]. The importance of promoter hypermethylation for gene expression has been exemplified by the ability of the demethylating agent 5-aza-2´-deoxycytidine to reactivate the genes and restore active transcription in cultured cancer cells [16].

Epigenetic inactivation may affect all of the molecular pathways involved in the transformation of a cell. Promoter hypermethylation-associated gene silencing has, amongst others, been observed in the APC/β-catenin route (APC), DNA repair (MLH1, MGMT), cell cycle (CDKN2A) and other pathways and processes that together constitute the hallmarks of cancer (see section 1.1). Some tumour-suppressor genes are hypermethylated across several cancer types. However, several genes are methylated in a tumour-specific manner, constituting a distinct DNA methylation profile for each cancer type. It is likely that some of the epigenetic events contribute to the neoplastic process (drivers), while others are mere passengers.
When comparing the hypermethylation profiles of distinct cancer types, the general observation is that tumours arising in the gastrointestinal tract, such as in the colon, rectum and stomach, share a set of genes that undergo hypermethylation, while tumours arising in other tissues, such as the lung, neck and/or head, have a different “hypermethylome” pattern [27].

Figure 4. Normal versus cancer epigenome. In normal mammalian cells, CpG islands located in proximal gene promoter regions are protected from DNA methylation (cytosines shown as open lollipops) and reside in an open chromatin conformation, or euchromatin, where transcription of the downstream gene is constitutive. In cancer cells, however, the CpG islands in the promoter tend to be hypermethylated (cytosines shown as black lollipops) and reside in a closed chromatin conformation, or heterochromatin. This condition does not favour transcription, and the gene is therefore in an inactive state. Modified after Baylin and Herman, 2003 [16].

DNA methylation in normal development

DNA methylation is one of the mechanisms that control the normal development of a fertilised egg into an embryo, and that maintain the normal pattern of gene expression in all cells after the embryo is fully developed. In most mammalian genomes, CpG islands are generally unmethylated in normal tissue, apart from a few well-known exceptions such as genomic imprinting, X-chromosome inactivation and intragenomic repetitive and parasitic sequences (Figure 3). These cases will be discussed briefly in the following.

Genomic imprinting

Genomic imprinting refers to genes that are differentially expressed depending on whether they are inherited from the maternal or paternal genome. The imprints are established during the development of germ cells into either sperm or eggs. These imprints are maintained during fertilisation, as the chromosomes duplicate and segregate.
The germ cells of the new organism will erase all of the imprinted marks, and a new pattern will be established, one that is maintained and modified in all somatic cells during development [28]. The role of DNA methylation in maintaining allele-specific expression was put forward by Li and colleagues in 1993 [29], who demonstrated that methylation patterns may be inherited in a parent-of-origin specific manner. Imprinted genes have further been shown to be associated with CpG islands [30], and local epigenetic modifications of these sites, as well as of the surrounding histone tails protruding from the chromatin, contribute to determining differential gene expression.

Loss of imprinting (LOI) of the insulin-like growth factor 2 gene (IGF2) has been discovered in embryonic tumours, such as Wilms tumour [31]. LOI of IGF2 has been linked to hypermethylation or a deletion in the differentially methylated region localised upstream of the reciprocally imprinted paternal H19 allele, leading to biallelic expression [21,32]. H19 and IGF2 are expressed in a monoallelic fashion from the maternal and paternal chromosomes, respectively. Furthermore, a study carried out by Feinberg et al showed that loss of IGF2 imprinting is associated with an increased risk of developing colorectal cancer [31].

X-chromosome inactivation

X-chromosome inactivation is a mechanism that equalises the gene dosage between females (harbouring two X chromosomes; XX) and males (harbouring one X chromosome; XY) by inactivating one of the X chromosomes in female cells [33]. One of the two copies will be subjected to genetic and epigenetic regulatory mechanisms, leading to transcriptional silencing and inactivation. The inactivated X chromosome (Xi) expresses the XIST (X-inactive specific transcript) gene, a nuclear non-coding RNA that originates in the X-inactivation centre.
XIST coats the chromosome, which subsequently forms a condensed and clearly visible heterochromatin-like structure called the Barr body [34]. Berg et al showed that once the XIST RNA has covered the chromosome, epigenetic modifications leading to X-chromosome inactivation are induced. The epigenetic mechanisms include hypoacetylation of lysine 9 at histone H3 (loss of H3K9ac) and trimethylation of lysine 27 at histone H3 (H3K27me3) [33]. The inactivated state of the Xi chromosome is clonally inherited through several rounds of cell division.

Methylation has long been thought to be a key mechanism in maintaining the inactivated state of the X chromosome. Mohandas et al found that treatment with the demethylating agent 5-azacytidine reactivated several genes on the Xi chromosome [35]. The importance of CpG methylation in stabilising and maintaining Xi is further demonstrated in patients with ICF (immunodeficiency centromeric instability facial anomalies). The syndrome is caused by a defect in the DNA methyltransferase DNMT3B [36], and as a consequence, several CpG islands analysed in the Xi of these patients are hypomethylated. It is proposed that escape from inactivation on the Xi is limited to candidate genes involved in pathways that are altered in ICF [37].

Silencing of parasitic and repetitive sequences

Repetitive elements constitute approximately 45% of the human genome, and loss of methylation of these sequences is thought to account for most of the global hypomethylation observed in close to all human cancer types [38]. This phenomenon is probably related to the fact that most mammalian repeat elements are maintained in a heavily methylated and inactivated state in normal somatic tissues [39]. While methylation inactivates the transcription and movement of the mobile repeat elements, demethylation may result in transcriptional interference and dysregulation of normal gene expression, leading to destabilisation and chromosomal translocations.
Loss of heterozygosity (LOH) could occur as a result of the rearrangements, and LOH is a strong driver of the neoplastic progression of a cell into a malignant tumour. A study carried out by Ward et al showed that hypomethylation of the retrotransposon L1 (LINE-1) was a frequent feature of colorectal cancer, possibly participating in chromosomal instability [23].

1.2.2 The interplay of epigenetic regulators

To understand the overall role of epigenetic alterations in gene activity, the molecular role of histone modification in organising and maintaining the chromatin structure must also be mentioned. Positioning of the nucleosomes, together with different modifications of the histone tails protruding from the nucleosome, modulates the normal epigenome in terms of maintaining normal gene expression and chromosome structure and function [40]. The balance of the histone modifications in conjunction with DNA methylation is strictly regulated, and even slight changes may alter the transcriptional state of a gene.

Histone modification and chromatin remodelling

The underlying unit of the chromatin, the nucleosome, has the same type of design in all eukaryotic cells, and consists of approximately 147 base pairs of DNA wrapped around an octamer of core histone proteins, including H2A, H2B, H3 and H4 [41]. Each of the histones is present in two copies in the octamer. Covalent modification of certain amino acids in the N-terminal histone tails has been proposed to create a histone code, in which the sum of different modifications determines whether the chromatin exists in an open and actively transcribed state (euchromatin) or in an inactive, closed state (heterochromatin). Among the post-translational modifications that constitute the code are methylation, acetylation, phosphorylation, ubiquitination and sumoylation. Acetylation of histone lysines is carried out by histone acetyltransferases (HATs) and is generally associated with active transcription.
Methylation of the residues, catalysed by histone methyltransferases (HMTs), is associated with either an active or an inactive state, depending on the modified residue (lysine or arginine) and the modification site. This is exemplified by methylation of lysine 4 on histone H3, which is associated with transcription, whereas methylation of lysine 9 on the same histone tail is associated with repression [16]. Chromatin remodelling, the general process of inducing changes in chromatin structure, consists of mechanisms that provide energy to displace nucleosomes. Relocating the nucleosomes at the promoter of a gene increases the accessibility of the DNA to transcription factors. SWI/SNF is an ATP-dependent remodelling complex that uses ATP hydrolysis to increase the accessibility of nucleosomal DNA, which is an essential requirement for activating transcription [42].

Candidate players for epigenetic regulation of gene expression

There is a tight interdependence between DNA methylation, histone modification and chromatin remodelling in packaging DNA and determining how available the regulatory regions are to transcription factors. Methylated CpG dinucleotides in promoters attract proteins that “read” the methylation pattern and repress the transcription of the downstream gene [18]. These proteins contain a methyl-CpG binding domain which recognises and binds to methylated CpG sites, and they often recruit histone modifying and chromatin remodelling complexes. One such methyl binding domain (MBD) containing protein is MeCP2. MeCP2 has been shown to form a complex with histone deacetylases (HDACs) and the co-repressor SIN3, leading to transcriptional repression in a methylation-dependent manner. In addition, MBD containing proteins may also interact with histone methyltransferases [18].
Harikrishnan et al showed that Brahma (Brm), which is a catalytic subunit of the SWI/SNF complex, associates with MeCP2, providing a potential link between DNA methylation and chromatin silencing [43]. All of these findings suggest that DNA methylation, histone modification and chromatin remodelling are connected (Figure 5).

Figure 5. Interaction between DNA methylation, histone modification and chromatin remodelling can silence gene transcription. Methylated DNA recruits proteins with methyl-CpG binding domains (MBDs) such as MeCP2. MeCP2 usually occurs in a complex with histone deacetylases (HDACs) and co-repressors, such as SIN3. HDACs remove acetyl groups from the histone tails protruding from the chromatin, contributing to gene inactivation. The inactive state of the gene may be sustained by methylation of lysine 9 on histone H3, performed by histone methyltransferases (HMTs). Chromatin remodelling complexes, such as SWI/SNF, can be recruited by MeCP2, leading to rearrangement of the chromatin structure into a more densely packed conformation. Modified after Wang, 2005 [44].

With the exception of DNA methylation, the establishment and maintenance of modification patterns after replication are not well understood. It has, however, been proposed that DNA methylation patterns and various modifications of histone tails have a mutually dependent biological relationship. The relationship might work both ways: histone modifications may direct DNA methylation, while DNA methylation might in turn serve as a template for diverse histone modifications. Bergman et al postulated that mono-, di- and trimethylation of lysine 4 on histone H3 (H3K4) is established before de novo DNA methylation, prior to implantation of the embryo. At the time of implantation, DNMT3A and DNMT3B are expressed and directed to nucleosomes lacking H3K4 methylation, where they perform de novo methylation of nearby CpG sites.
1.2.3 RNA-mediated gene silencing

Small, endogenous RNA molecules, named microRNAs (miRNAs) and small interfering RNAs (siRNAs), regulate the stability and translation of target messenger RNAs (mRNAs). In recent years, altered expression of miRNAs in various cancers due to epigenetic regulation has been shown to be an additional hallmark of carcinogenesis. This is described further in the following.

Micro-RNA

MicroRNAs (miRNAs) are small, single-stranded, non-coding RNAs of approximately 22 nucleotides (nt) that negatively regulate gene expression in eukaryotic cells through translational inhibition or degradation of messenger RNA [45]. Most miRNAs are derived from primary miRNA transcripts produced by RNA polymerase II and cleaved into a 70 nt long pre-miRNA by a multiprotein complex in the nucleus. The pre-miRNA is then exported to the cytoplasm and cleaved into a mature, 22 nt miRNA, which is subsequently incorporated into an RNA-induced silencing complex (RISC). The miRNA guides RISC to the 3´ untranslated region of specific target mRNAs [46]. miRNAs downregulate the expression of target genes depending on the level of complementarity to the target mRNA. Perfect or near-perfect complementarity induces mRNA degradation, whereas imperfect binding results in translational inhibition. miRNAs have recently been shown to participate in the control of numerous cellular processes, such as apoptosis, differentiation, proliferation and development. miRNAs are either downregulated or upregulated in carcinomas when compared to normal tissue, resembling the inactivation of tumour-suppressor genes or the activation of proto-oncogenes during carcinogenesis. Hence, miRNAs can serve as both tumour-suppressor genes and oncogenes, and disruption of the miRNAs involved in maintaining the normal activity of the cell may contribute to the formation of tumours [47].
Epigenetic dysregulation of microRNAs in cancer

Epigenetic profiling of miRNAs has revealed new insights into the altered epigenetic regulation of these molecules in diseases, including cancer. In recent years, silencing of miRNA gene expression due to hypermethylation of associated CpG islands has led to speculation as to whether miRNAs could be utilised as diagnostic and prognostic biomarkers. Downregulation of miRNAs, with potential histone modifications and subsequent transcriptional inactivation, is now a widely accepted feature of several cancer types [45]. MiR-34b and miR-34c, two components of the TP53 pathway, are epigenetically inactivated in colorectal cancer, and treatment with the demethylating chemical 5-aza-2´-deoxycytidine has been shown to restore their expression [48]. MiR-143 has been shown to target DNMT3A in colorectal cancer, leading to reduced growth of colon cancer cells [49]. Among the oncogenic miRNAs, miR-21 and miR-31 have been shown to be upregulated in colorectal cancer. Upregulation of miR-21 has been discovered in the advanced stages, indicating that it may play an important role in the invasiveness of the cancer. Furthermore, miR-21 has been suggested to post-transcriptionally downregulate the translation of genes involved in suppressing tumour progression. MiR-31 is another potential micro-oncogene in colorectal tumours. Interestingly, members of the Wnt signalling pathway are among the target genes of miR-31 [50].

1.2.4 Epigenetics: Nature or nurture or both?

Epigenetic events play a fundamental role in normal physiological responses to environmental stimuli, which affect the epigenome and subsequently direct alterations in the epigenetic state of the genome. Dietary and environmental substances have been shown to affect the epigenetic patterns, leading to changes that accumulate over a longer period of time. Factors that induce such changes are termed “epigenetic carcinogens” (epimutagens) [51].
Chemical and physical epimutagens, such as tobacco smoke and irradiation, are examples of factors that contribute to the development of cancer by inducing epigenetic, as well as genetic, changes. In terms of nutrition and diet, deficiency in folate and methionine, which are involved in the processes that supply methyl groups for DNA methylation, may affect the level of methylated CpG sites [52]. Accordingly, epidemiologic research suggests that diets providing higher levels of folate may reduce the risk of developing colorectal cancer. However, large doses of folic acid have been shown to lead to unmetabolised folic acid in the peripheral blood, and a subsequent reduction in natural killer cells [53]. In addition, intake of folic acid supplements in early pregnancy has been associated with epigenetic alterations of IGF2 in the child, which may affect the growth, development and health of the child [54]. The risk of developing cancer increases with age, probably because cells progressively accumulate enough errors to evade the homeostatic control mechanisms that govern normal cell behaviour and tissue contexts. A recently published article by Kelsey et al showed that CpG-island loci in a wide range of normal tissues gain methylation with age, and the authors hypothesise that reduced fidelity of the maintenance methyltransferases with ageing could be one potential explanation for this phenomenon [55]. Several twin studies on epigenetic profiles have investigated to what extent age, environment and lifestyle can impact gene expression [56]. Fraga and co-workers were the first to use monozygotic twins for this purpose and found that twins who had spent less of their lives together, and had different natural health-medical histories, were those with the greatest epigenetic differences [57]. In addition, the older twin pairs included in the study seemed to have the greatest epigenetic differences.
Taken together, these findings demonstrate that age, diet, lifestyle and environmental factors affect the epigenetic pattern in individuals with identical genetic make-up. Thus, distinct epigenetic profiles may help to explain the phenotypic differences and different disease susceptibility among monozygotic twins.

1.3 Colorectal cancer

Colorectal cancer (CRC) is the third most commonly diagnosed cancer among men and women, with approximately 1 million new cases each year world-wide [58]. The incidence in Norway is approximately 3500 new cases per year [59]. An almost equal number of men and women develop CRC, indicating that the cancer is gender independent. The risk of developing CRC increases with age, and the disease primarily affects older individuals (median age 70 years). The significant difference in CRC occurrence between industrialised and developing countries emphasises the importance of lifestyle and environmental factors in CRC development. Dietary risk factors, including red and processed meat, as well as alcohol consumption, tobacco, diabetes, obesity and physical inactivity, are associated with a greater risk of developing CRC [60]. While the majority of CRCs are sporadic, a small group (~5%) arises as a consequence of defects in single hereditary components, such as hereditary nonpolyposis colorectal cancer (HNPCC) and familial adenomatous polyposis (FAP). HNPCC, also known as Lynch syndrome, is an autosomal dominant disease caused by a germline mutation in one of the DNA mismatch repair (MMR) genes [61]. Consequently, DNA replication errors occur at a higher frequency in repetitive sequences, known as microsatellites, leading to microsatellite instability. HNPCC accounts for 1-5% of colorectal cancers, and is associated with an 80% lifetime risk of developing colorectal cancer [62]. HNPCC is associated with a better prognosis when compared to sporadic cancers.
FAP is an autosomal dominant disease which accounts for less than 1% of all CRC cases, and is associated with a nearly 100% lifetime risk of developing colorectal cancer unless the colon is removed by surgery [61]. The disease is characterised by the presence of more than 100 adenomatous polyps, and the number of polyps increases with age. Truncating mutations in the Adenomatous polyposis coli (APC) tumour-suppressor gene have been shown to be the cause of most FAP cases. The APC gene participates in the Wnt signalling pathway, where it is involved in the degradation of β-catenin (CTNNB1). Mutations in APC affect the ability to maintain normal growth, subsequently leading to uncontrolled overgrowth of cells.

1.3.1 Molecular developmental pathways

Two distinct colorectal cancer pathways have been suggested to explain the step-wise process from benign neoplasm to adenocarcinoma: the chromosomal instability (CIN) pathway and the microsatellite instability (MSI) pathway. Chromosomal instability is the most common type of genomic instability and accounts for more than 80% of colorectal carcinomas. Chromosomal instability occurs mainly as a consequence of either missegregation of normal chromosomes or structural rearrangements [63]. Inactivation of proteins that regulate the mitotic spindle checkpoints, DNA damage checkpoints, chromosome metabolism, centrosome function and DNA replication has been hypothesised to cause CIN in cancer [64]. Aneuploidy (footnote 1) serves as a hallmark of chromosomal instability [64], and tumours that exhibit the CIN phenotype are most often located in the distal, or left, side of the colon. Furthermore, the CIN molecular pathway is associated with a poorer prognosis for the patient when compared to MSI.

Microsatellite instability occurs in ~15% of sporadic colorectal carcinomas. Microsatellites are stretches of DNA where a short pattern of 1-6 bases is repeated several times, and these motifs are spread throughout the genome.
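The definition of a microsatellite given above (a motif of 1-6 bases repeated several times in tandem) can be illustrated with a minimal Python sketch. It is not part of the methods used in this thesis. The function name `find_microsatellites` and the default threshold of five repeats are assumptions chosen for illustration, since the text only says "several times".

```python
import re
from typing import List, Tuple


def find_microsatellites(seq: str, min_repeats: int = 5) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for tandem repeats of 1-6 bp motifs.

    min_repeats is an illustrative threshold, not a value taken from the thesis.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # Lazy {1,6}? tries the shortest motif first, so "AAAAAA" is reported
    # as motif "A" rather than "AA" or "AAA".
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# A (CA)5 dinucleotide repeat starting at 0-based position 3.
print(find_microsatellites("GGTCACACACACATT"))  # [(3, 'CA', 5)]
```

In MSI testing the question is whether the repeat count at a given locus differs between tumour and normal DNA from the same patient, so a function like this would be applied to both sequences and the counts compared.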
HNPCC is, as mentioned previously, caused by a germline mutation in either of the DNA mismatch repair genes MLH1 or MSH2, while MSI in sporadic cancers is primarily caused by biallelic hypermethylation and subsequent inactivation of the mismatch repair gene MLH1 [65]. These genetic and epigenetic changes give rise to a defective DNA mismatch repair system, leading to an inability to repair base-base mismatches and small insertions and deletions. This in turn causes an increased genomic mutation rate, which is why MSI has been referred to as the mutator phenotype [62]. The defective repair system may ultimately facilitate malignant transformation by allowing rapid accumulation of alterations in genes that ordinarily have key functions in the cell. Sporadic MSI colorectal carcinomas have distinct clinicopathological features, such as location in the proximal colon, diploid or near-diploid karyotype, association with the female gender and poor differentiation [66].

In addition to these molecular phenotypes, an alternative third molecular route for colon cancer development was suggested by Toyota et al in 1999. The CpG Island Methylator Phenotype (CIMP) is based on the identification of a subset of colorectal cancers with concordant hypermethylation of several CpG loci. Toyota and co-workers showed that CIMP positive tumours include the majority of sporadic colorectal cancers with MSI related to MLH1 hypermethylation [67]. Furthermore, CIMP has been shown to be strongly associated with BRAF mutation [68]. Hierarchical clustering has identified three groups with distinct genetic and epigenetic profiles (Figure 6). CIMP negative tumours are rarely methylated and display TP53 mutations. CIMP1 tumours are methylated at multiple loci and display MSI and BRAF mutations, while CIMP2 tumours are methylated at a limited number of age-related genes and display mutations in KRAS [68,69]. CIMP tumours are characterised by many of the features typical of MSI tumours, such as proximal location, BRAF mutation, poor differentiation, female gender and old age.

Footnote 1: Aneuploidy: having a chromosome number that is not an exact multiple of the usual haploid number.

Figure 6. Integrated genetic and epigenetic analysis identifies three different subclasses of CIMP. Shen and co-workers utilised integrated analysis of the mutation status of BRAF, KRAS and TP53, as well as MSI status and methylation status of a panel of genes, to identify three distinct subgroups of CIMP (from [69]).

1.3.2 Histopathology and morphological pathways

The transition from normal epithelium to carcinoma in the human colon can be arranged into a series of stages of increasing abnormality. Several lines of evidence indicate that adenomas can develop into carcinomas; however, this does not imply that all adenomatous polyps will undergo malignant transformation. Aberrant crypt foci (ACF) represent one of the earliest steps in the development of colorectal cancer, followed by other morphological precursors. These pre-malignant outgrowths differ in size, level of dysplasia and villous complexity [70]. Colorectal polyps are associated with different genetic and epigenetic alterations when compared to normal tissue; thus, an accurate description of the phenotype is important to achieve correct phenotypic and (epi)genotypic classification. The adenoma-carcinoma sequence refers to the evolution of normal epithelial cells to adenocarcinomas as a progression of histological changes and concurrent genetic and epigenetic alterations. According to this model, adenomas can develop into either MSI or CIN tumours, depending on the alterations [71]. The concept that inactivation of the APC tumour-suppressor gene initiates colorectal neoplasia has been shown to be an oversimplification.
Research findings support the theory that adenomas give rise only to CIN and CIMP negative tumours, whereas sessile serrated adenomas, a subgroup of the hyperplastic polyps, have emerged as precursors to MSI and CIMP tumours (Figure 7). Mutation in the BRAF proto-oncogene is considered to be the “gatekeeper” (footnote 2) in this pathway [72-74].

Footnote 2: A class of genes which directly regulate tumour growth by inhibiting growth or by promoting cell death.

Figure 7. Molecular developmental pathways in colorectal cancer. Colorectal cancer is thought to develop through two molecularly and morphologically distinct pathways: the sessile serrated pathway giving rise to CIMP positive MSI tumours (red), and the chromosomal instability pathway giving rise to CIN tumours (blue). The histologically diverse steps are associated with distinct genetic (bold) and epigenetic (bold, italic) alterations. The “gatekeeper” gene APC has long been known to initiate the adenoma-carcinoma sequence, whereas BRAF is thought to initiate the sessile serrated pathway.

1.3.3 Tumour classification, treatment and outcome

Cuthbert E. Dukes proposed the original staging system for tumours of the colon and the rectum in 1932 [75], and a modified version is used today to divide colon and rectal cancer into four Dukes´ stages. Dukes´ A tumours are confined to the intestinal mucosa and submucosa, whereas Dukes´ B tumours have grown through these layers and into the muscle layers of the bowel wall. Dukes´ C tumours have spread to the regional lymph nodes, and Dukes´ D tumours have distant metastases in other organs of the body (Figure 8). The TNM staging system is also widely used. TNM stands for “tumour, node, metastasis”, and the model describes the size of the primary tumour (T), whether any lymph nodes contain cancer cells (N) and whether the cancer has spread to another part of the body (M).

Figure 8. Carcinogenesis of colorectal cancer and Dukes´ classification.
The figure illustrates the steps in which a benign precursor, the adenoma, develops into a malignant polyp. The polyp may progress into a tumour which, according to the level of penetration of the bowel wall, is classified as a Dukes´ A, B, C or D tumour. In Dukes´ A colorectal cancer, the tumour has only affected the innermost lining of the colon (mucosa and submucosa), whereas Dukes´ B cancer has grown through the muscle layers. Dukes´ C cancer has spread to at least one local lymph node, while Dukes´ D cancer has metastasised to distant organs.

Survival among CRC patients depends on the tumour stage at the time of diagnosis. Patients diagnosed with localised tumours (Dukes´ A and B) have a five-year survival close to 90%, while patients diagnosed with spread to regional lymph nodes (Dukes´ C) have a five-year survival of 65%. Patients diagnosed with distant metastasis (Dukes´ D) have the worst prognosis, with a five-year survival of only 10% [59]. Colorectal cancer patients in Norway are today treated with surgery and/or adjuvant chemotherapy (footnote 3), depending on the tumour stage at diagnosis. Patients with localised tumours (Dukes´ A and B) receive surgery alone. Adjuvant chemotherapy (surgery in combination with chemotherapy) is given to patients with Dukes´ C cancer. Additionally, some Dukes´ B patients receive adjuvant treatment when an inadequate number of lymph nodes has been analysed for the presence of cancer cells. The most common regimen is 5-fluorouracil/leucovorin (calcium folinate) in combination with other drugs, such as oxaliplatin.

1.4 Clinical relevance of molecular biomarkers

The identification of molecular biomarkers has been the focus of extensive research, where the ultimate goal is to discover markers with a diagnostic and/or therapeutic value.
Molecular biomarkers are defined as indicators of normal biological processes, pathogenic processes or pharmacologic responses to therapeutic intervention, and can be DNA, RNA or protein based, as exemplified in the following. Epigenetic changes, including DNA hypermethylation, are potentially good indicators of existing diseases. DNA hypermethylation of ADAMTS1, CRABP1 and NR3C1 has been found in 71%, 49% and 25% of colorectal carcinomas, respectively [76]. Interestingly, epigenetic gene regulation can also be utilised to predict response to treatment. Alkylating agents are frequently used as treatment against malignant brain tumours. Methylation of the MGMT promoter has been shown to predict the response of brain tumours to alkylating agents [77,78].

Footnote 3: From the web page of the Norwegian Gastro Intestinal Cancer Group: http://ngicg.no/wp/

RNA expression analysis has identified distinct molecular subtypes of breast cancer, predicted response to neoadjuvant therapy (treatment before primary surgery) and discovered gene-expression signatures that distinguish primary tumours from metastatic adenocarcinoma [79-81]. Furthermore, alternative splicing of RNA molecules can generate cancer-specific splice variants which may serve as diagnostic disease biomarkers. Through alternative splicing, a single gene is able to generate several transcript variants from one type of precursor messenger RNA (pre-mRNA), which may produce different protein isoforms. CD44 and WT1 have been characterised as cancer-related genes that undergo extensive alternative splicing [82]. Splice variants that are overexpressed in cancer are usually expressed as hyperoncogenic proteins, which often correlate with poor prognosis [83]. Conversely, cancer therapy directed at correcting malfunctioning splicing machinery can prevent the production of oncogenic mRNA splice variants [84].
Proteins that are produced in increased amounts can serve as biomarkers to detect specific diseases. One example is GOLM1, which has been found to be upregulated in the urine of patients with prostate cancer, suggesting that GOLM1 levels in urine can serve as a predictor of prostate cancer [85]. Unlike genomic measurements, which generally require a biopsy, blood samples provide important insight into the presence and activity of proteins in diseases. However, protein biomarker research is complicated by the difficulty of identifying medium- or low-abundance proteins in the plasma. Many biomarkers with clinical value have concentrations five to seven orders of magnitude lower than the most highly concentrated plasma proteins. It is possible to overcome this obstacle if the biomarkers arise locally (e.g. from a malignant tumour), by analysing fluid close to or in contact with the site of disease [86].

2. Aims

The overall aim of the present study was to identify novel epigenetic biomarkers with a diagnostic potential in colorectal cancer. The first objective was to identify as yet undiscovered candidate target genes inactivated by promoter hypermethylation. Second, the candidate genes identified needed to be evaluated for cancer specificity through experimentally optimised assays in suitable normal and tumour samples. The final objective of this thesis was to evaluate the suitability of any novel target genes for development as biomarkers for colorectal cancer.

3. Materials and methods

3.1 Materials

3.1.1 Colon cancer cell lines

Twenty colon cancer cell lines were analysed in this project. The cell lines included eleven microsatellite stable (MSS; ALA, Colo320, EB, FRI, HT29, IS1, IS2, IS3, LS1034, SW480, V9P) and nine microsatellite unstable (MSI; Co115, HCT15, HCT116, LoVo, LS174T, RKO, SW48, TC7, TC71) cell lines, thereby representing both of the phenotypical subgroups of colorectal cancer.
3.1.2 Tissue samples – Colorectal carcinomas and normal mucosa

Forty-seven primary colorectal carcinoma samples, including 27 MSS and 20 MSI tumours, were subjected to DNA promoter methylation analysis in the present study. Twenty-four of the samples were derived from a series collected at seven hospitals in the south-eastern part of Norway from 1987 to 1989 [87]. The remaining 23 samples were collected at Aker University Hospital from 2005 to 2007. For detailed clinico-pathological data, see Appendix I. Also included in the present project were 49 normal colorectal mucosa samples derived from deceased colorectal cancer-free individuals (Institute of Forensic Medicine, Rikshospitalet University Hospital). These clinico-pathological data are listed in Appendix II. Genomic DNA from cell lines, primary tumours and normal tissue had previously been isolated using a standard phenol/chloroform extraction method [88].

3.2 Methylation-specific methodologies

3.2.1 Strategy to select novel DNA methylation candidate genes

Genome-wide gene expression analysis

In contrast to genetic changes, epigenetic modifications can be reversed. Treatment of cancer cells with the demethylating agent 5-aza-2´-deoxycytidine (AZA) and the histone deacetylase inhibitor Trichostatin A (TSA) has been shown to reactivate tumour suppressor genes inactivated by promoter hypermethylation and histone deacetylation [89]. AZA is a cytosine analogue which is incorporated into the DNA and forms an irreversible covalent complex with DNMT1 [90]. This in turn leads to depletion of DNMT1 in the cell during DNA replication, causing passive demethylation and subsequent reactivation of genes [91]. Histone deacetylase inhibitors, such as TSA, modulate the expression of genes by causing an increase in histone acetylation, thereby relieving the transcriptional repression of the chromatin.
The AB1700 microarray platform (Applied Biosystems, Foster City, CA, USA) was utilised to analyse the gene expression of colon cancer cell lines before and after treatment with AZA and TSA, in order to identify novel gene targets epigenetically inactivated in colorectal tumorigenesis. The analyses were performed prior to this master's thesis. Six cell lines were analysed, including three MSI (SW48, RKO, HCT15) and three MSS (SW480, LS1034, HT29) cell lines. Only genes upregulated at least four-fold after treatment in at least five of the six cell lines analysed were chosen for further investigation. In order to increase the likelihood of selecting true epigenetic targets, the expression of the same genes was analysed in primary colorectal carcinomas and normal tissue samples, using the same microarray platform. Only those genes that responded in cell lines and simultaneously were down-regulated in the carcinomas as compared to normal tissue were chosen for further analysis. The selection process for the discovery of new, hypermethylated genes is summarised in Figure 9.

Figure 9. Strategy to select novel DNA methylation candidate genes. Prior to the start of the master's project, six cell lines and their AZA and TSA treated counterparts were analysed on the AB1700 microarray platform. Genes upregulated at least four-fold after treatment in at least five of six cell lines, while simultaneously being down-regulated in primary colorectal carcinomas relative to normal colon mucosa, were chosen for downstream methylation analyses. The master's project included in silico analyses of potential targets for DNA methylation. Only genes containing one or more CpG islands in the promoter were subjected to analyses in cell lines and clinical samples.
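The selection criteria summarised in Figure 9 can be written as a simple filter. The Python sketch below is an illustration under stated assumptions and is not the software used in the study. The `Candidate` record, its field names, the gene symbols and all numbers in the example are hypothetical. The tumour/normal expression ratio is an assumed summary statistic, as the thesis does not specify how down-regulation was quantified.

```python
from dataclasses import dataclass
from typing import Dict, List

FOLD_CHANGE_CUTOFF = 4.0   # at least four-fold upregulation after AZA/TSA treatment
MIN_RESPONDING_LINES = 5   # in at least five of the six cell lines


@dataclass
class Candidate:
    symbol: str
    # Treated/untreated expression ratio per cell line (hypothetical layout).
    fold_change: Dict[str, float]
    # Carcinoma/normal expression ratio (assumed summary statistic; < 1 means down-regulated).
    tumour_vs_normal: float
    # Result of the in silico CpG island search of the promoter.
    has_promoter_cpg_island: bool


def passes_selection(gene: Candidate) -> bool:
    """Apply the three criteria from Figure 9 to one gene."""
    responding = sum(fc >= FOLD_CHANGE_CUTOFF for fc in gene.fold_change.values())
    return (
        responding >= MIN_RESPONDING_LINES
        and gene.tumour_vs_normal < 1.0
        and gene.has_promoter_cpg_island
    )


def select_candidates(genes: List[Candidate]) -> List[str]:
    return [g.symbol for g in genes if passes_selection(g)]


# Invented example values, for illustration only.
lines = ["SW48", "RKO", "HCT15", "SW480", "LS1034", "HT29"]
example = [
    Candidate("GENE_A", dict(zip(lines, [6.1, 4.0, 8.3, 5.2, 2.1, 4.7])), 0.4, True),
    Candidate("GENE_B", dict(zip(lines, [6.1, 3.9, 8.3, 5.2, 2.1, 4.7])), 0.4, True),
    Candidate("GENE_C", dict(zip(lines, [6.1, 4.0, 8.3, 5.2, 9.0, 4.7])), 1.3, True),
    Candidate("GENE_D", dict(zip(lines, [6.1, 4.0, 8.3, 5.2, 9.0, 4.7])), 0.4, False),
]
print(select_candidates(example))  # ['GENE_A']
```

Only GENE_A meets all three criteria: GENE_B responds in four cell lines only, GENE_C is not down-regulated in carcinomas, and GENE_D lacks a promoter CpG island.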
Analysing gene promoters for the presence of CpG islands

Because loss of gene expression is often associated with aberrant methylation of promoter CpG islands, suitable target genes for DNA methylation analysis should contain a CpG island in their promoter region. The CpG Island Searcher (footnote 4) was applied to analyse the candidate genes for the presence of one or more islands. The algorithm and criteria used in the program were described by D. Takai and P.A. Jones in 2002 [17].

Footnote 4: http://www.uscnorris.com/cpgislands

3.2.2 Bisulfite modification

The principle of bisulfite modification of DNA was first described in 1970 [92], but protocols for using it in epigenetic analyses were not published until the 1990s [93,94]. The modification of DNA translates the methylation event into a genetic change, which can then be analysed using various polymerase chain reaction-based methods. The principle behind this treatment is that bisulfite deaminates unmethylated cytosines (C) to uracil under conditions of low pH and high bisulfite salt concentration, while 5-methylcytosines remain protected from this conversion. During PCR the uracils are replaced with thymine, while the methylated Cs remain the same. Hence, the treatment serves as a chemical modification resulting in sequence differences that can be used to determine the methylation pattern in subsequent methods. In this thesis, bisulfite mediated conversion was performed using the EpiTect Bisulfite Kit supplied by Qiagen (Qiagen Co., Valencia, California, USA). During the conversion procedure, sample DNA was mixed with the bisulfite solution and a DNA protection buffer. Incubation of the DNA samples at high salt concentration, high temperature and low pH will eventually lead to fragmentation and loss of DNA.
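As an aside on the conversion principle described earlier in this section, the effect of bisulfite treatment followed by PCR on a DNA sequence can be sketched in a few lines of Python. This is an in silico illustration only. The function names and the example sequence are hypothetical, and only the top strand is modelled.

```python
from typing import Iterable, List


def cpg_positions(seq: str) -> List[int]:
    """0-based positions of cytosines that are followed by a guanine (CpG sites)."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]


def bisulfite_convert(seq: str, methylated_positions: Iterable[int] = ()) -> str:
    """Sequence as read after bisulfite treatment and PCR (top strand only).

    Unmethylated C is deaminated to U and amplified as T, while
    5-methylcytosine (listed in methylated_positions) stays C.
    """
    protected = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq.upper())
    )


seq = "ACGTCCGA"  # hypothetical fragment with CpG sites at positions 1 and 5
print(bisulfite_convert(seq))                      # ATGTTTGA (unmethylated template)
print(bisulfite_convert(seq, cpg_positions(seq)))  # ACGTTCGA (fully methylated CpGs)
```

The two output sequences differ only at the CpG cytosines, which is the sequence difference that methylation-specific primers exploit. The non-CpG cytosine at position 4 is converted in both cases.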
The DNA protection buffer limits the degradation of DNA and contains a pH indicator which confirms that the reaction pH is suitable for complete conversion of unmethylated cytosines. The reaction was performed in a thermal cycler (MJ Mini Personal Thermal Cycler, BIO-RAD, Hercules, CA, USA), in a program consisting of a series of denaturation and incubation steps. This ensures proper denaturation of the DNA, and the subsequent sulfonation and cytosine deamination. The reaction is followed by clean-up of the bisulfite converted DNA, where the samples are desulfonated and washed. The input amount of DNA for the bisulfite conversion was 1.3 µg, and the DNA was eluted in 40 µl elution buffer, corresponding to a final concentration of 32.5 ng/µl (1.3 µg in 40 µl, assuming complete recovery). For standardisation of the clean-up process, the Qiacube (Qiagen) automated pipetting system was utilised.

Fully denatured DNA prior to bisulfite treatment is important because the reaction is highly single-strand specific. Unsuccessful denaturation may lead to incomplete conversion, and the subsequent downstream analyses may wrongly interpret unconverted, unmethylated cytosines as methylated cytosines, producing false positive results. Several important factors contribute to successful and effective bisulfite conversion. First, the DNA must be of high quality and fully denatured. Second, correct pH and incubation temperature for the various steps are crucial to obtain optimal conversion conditions. Third, because bisulfite is readily oxidised by oxygen, a free-radical scavenger should be included in the reaction mixture to minimise oxidative degradation. The conversion is highly efficient, with an estimated rate of 99%. However, a conversion rate of 95%-98% is more frequent due to varying DNA quality [95].

3.2.3 Qualitative methylation-specific polymerase chain reaction

In 1996, Herman and colleagues introduced methylation-specific PCR (MSP) [96].
MSP uses bisulfite-treated DNA as template and two primer sets with distinct specificities (Figure 10). One primer set is designed to anneal to and amplify the unmethylated sequence, and the other to anneal to and amplify the methylated sequence. The resulting PCR products can be visualised by UV irradiation following gel electrophoresis and ethidium bromide staining. The technique can detect as little as 1 methylated allele among 1000 unmethylated alleles, making MSP among the most sensitive methylation analysis methods [97].

Figure 10. Methylation-specific polymerase chain reaction assay. The principle of qualitative methylation-specific polymerase chain reaction, with specific primers annealing to either the methylated or the unmethylated fragment.

Ten genes were analysed by MSP using DNA isolated from 20 colon cancer cell lines. Normal blood and human placenta DNA treated in vitro with SssI methyltransferase were used as positive controls for the unmethylated and methylated reactions, respectively. Milli-Q water was used as a negative control. If the results from the cell line analysis indicated a high frequency of methylation, MSP was performed on colorectal carcinomas and normal tissue samples as well. The samples for each gene were scored relative to the intensity of the positive control, as either weakly methylated (intensity less than the positive control) or heavily methylated (intensity equal to or higher than the positive control). For this thesis, only tumour samples scored as heavily methylated were considered methylated, while samples scored as weakly methylated were classified as unmethylated. This ensures a conservative classification with a low number of false positives. The scoring was performed independently by the author and another group member, Hilde Honne.
All results were verified by another round of analysis, and in cases with diverging results from the two rounds of MSP and/or discrepancy in the scoring by the two scorers, a third run of MSP was performed.

Primer design for MSP

The most critical parameter for the specificity and success of an MSP assay is the binding of the primers to their target sequence and their ability to discriminate between methylated and unmethylated fragments. To ensure proper discrimination, the primers should contain as many CpG sites as possible, including one or more CpG sites in the 3´ region of the primer. Amplification of unconverted DNA may give false positives with regard to the methylation level, which is why non-CpG cytosines should also be included in the primer sequences. Finally, the choice of region amplified by the primers is also important for the methylation analysis. The aim is to amplify a representative region of the promoter where methylation is most likely to affect transcriptional activity. It is therefore important to select an area surrounding the transcription start site of the gene. In this thesis, all ten MSP primer sets were designed using Methyl Primer Express 1.0 (Applied Biosystems) and purchased from MedProbe (Oslo, Norway). For detailed primer information, see Appendix III.

Optimisation of primer sets for MSP analysis

The primer sets must be optimised with regard to magnesium concentration, annealing temperature, and annealing and elongation time. This ensures that the primers function optimally and amplify the correct PCR fragment. The unmethylated and methylated reactions must be optimised separately, using bisulfite-converted DNA from normal blood as positive control for the unmethylated reaction, and human placenta DNA treated in vitro with SssI methyltransferase as positive control for the methylated reaction (see Figure 11).

Figure 11.
Temperature and magnesium gradient for the IQCG methylated fragment. A range of temperatures and MgCl2 concentrations was tested for all genes analysed. For this gene, 48° and 1.5 mM MgCl2 were selected for downstream analysis. M, 100 base pair marker; temperatures are in degrees Celsius; 1.5, 1.7 and 2.0 are mM MgCl2.

Magnesium

The MSPs were performed using a thermostable DNA polymerase (HotStar Taq DNA polymerase; Qiagen). Magnesium functions as a cofactor for the polymerase and may increase the efficiency with which the enzyme performs its catalytic activity. However, an increased amount of magnesium may lead to unspecific PCR products. To avoid this, a gradient of magnesium concentrations (1.5, 1.7 and 2.0 mM) was tested for all primer sets. The overall result indicated that the lowest concentration of magnesium was sufficient. Consequently, 1.5 mM was used as a standard in all reactions.

Annealing temperature

The annealing temperature of the primers is among the key factors determining the efficiency of PCR amplification. Primer sets work best within distinct temperature ranges, reflecting their ability to bind to the template within those ranges. Too high an annealing temperature may give a low PCR yield resulting from insufficient primer-template hybridisation, while too low an annealing temperature may give unspecific products as a consequence of base pair mismatches. Two algorithms were used to calculate the melting temperatures (Tm) of the primer sets, and a temperature gradient was set up for the unmethylated and methylated fragments.

Reaction time and cycles

Generally, the majority of the primer sets amplified an adequate amount of MSP product with 30 seconds of annealing, 30 seconds of elongation and 35 cycles. Some reactions may, however, require an increase in these parameters to improve the efficiency of the PCR. The overall aim is to generate comparable band intensities for the methylated and unmethylated positive controls.
MSP experimental assay

The MSP mix consisted of ca. 24 ng/µl bisulfite-treated template DNA, 1 x Qiagen PCR buffer (containing 1.5 mM MgCl2), 0.8 µM of each primer (MedProbe), 0.2 mM of each of the four dNTPs (Amersham Biosciences, Piscataway, NJ, USA), 0.2 mM MgCl2 solution (Qiagen) for some of the methylated reactions, one unit HotStar Taq polymerase (Qiagen) and Milli-Q water to a total reaction volume of 25 µl. The DNA was amplified using a Robocycler Gradient 96 thermal cycler (Stratagene, La Jolla, California, USA). The cycling conditions consisted of an initial denaturation step at 95° for 15 minutes to activate the enzyme, followed by 35 cycles of denaturation at 95° for 30 seconds, annealing at 48°-56° for 30 seconds and elongation at 72° for 30 seconds, and a final extension step at 72° for 7 minutes. The annealing temperature was adapted to the melting temperatures of each primer set (see Table III in the Appendix). The PCR products were mixed with 5 µl gel loading buffer (1 x TAE buffer and 0.1% xylene cyanol) and separated on a 2% agarose gel (400 ml 1 x TAE and 8 g agarose; BIO-RAD) stained with ethidium bromide (a fluorescent intercalating dye). The electrophoresis was performed for 24 minutes at 200 V and the PCR products were visualised on a UV transilluminator (Chemidoc XRS Gel Documentation System; BIO-RAD and Gene Genius; Syngene, Cambridge, UK).

3.2.4 Quantitative real-time methylation-specific polymerase chain reaction

Real-time polymerase chain reaction is a technique used to measure the amount of product formed during each PCR cycle. This is in contrast to regular, qualitative PCR, where the amount of end-product is measured. The quantitative MSP (qMSP) assays used in this thesis include primers and probes designed specifically to amplify bisulfite-converted DNA. The probe has a fluorescent reporter dye attached to its 5´ end and a non-fluorescent quencher attached to its 3´ end.
When the probe is intact, the proximity of the quencher to the reporter dye suppresses the fluorescence emitted by the reporter. During PCR, the polymerase extends the primers and, owing to its 5´ nuclease activity, cleaves the probe and releases the reporter dye, which is detected as fluorescence by the real-time instrument. The amount of fluorescence is directly proportional to the amount of product [98]. The cycle threshold (Ct) value is defined as the number of cycles needed for the fluorescent signal to cross the threshold, i.e. exceed the background level. Consequently, the Ct value of a sample can be used to measure the initial quantity of target fragment. In this thesis, serial dilutions of samples with known concentrations were made to generate a standard curve. The Ct values of the dilutions were plotted against the logarithm of their concentrations, creating a linear relationship that was used to determine the quantity of a target sequence in an unknown sample [99]. Ct values are inversely related to the amount of target sequence: the lower the Ct value, the greater the amount of target sequence present in the sample. High Ct values indicate a minimal amount of target sequence and could represent contamination. We therefore set a cut-off at cycle 35; all PCR products with Ct values equal to or above this were censored.

Methylation-specific quantification assay

For real-time PCR-based quantification, primers and probes were designed manually using Primer Express Software 3.0 (Applied Biosystems). Probes were labelled with 6-FAM and a minor groove binder non-fluorescent quencher. The PCR was carried out in a reaction volume of 20 µl in 384-well plates, using the 7900HT Fast Real-Time PCR machine (Applied Biosystems).
The final reaction mixture consisted of 0.9 µM of each primer (MedProbe), 0.2 µM probe (Applied Biosystems), 1 x TaqMan Universal PCR Master Mix (No AmpErase UNG; Applied Biosystems), and 30 ng/µl of bisulfite-treated template DNA. Thermal cycling was initiated with a denaturation step at 95° for 10 minutes. The amplification protocol was 45 cycles of 95° for 15 seconds and 60° for 60 seconds. Each plate included several water blanks as non-template controls, normal blood as unmethylated control and in vitro methylated DNA as methylated control. The samples were run in triplicate in 384-well plates and the median value was used for data analysis. A standard curve was generated from 1:5 serial dilutions of bisulfite-converted, commercially available methylated DNA (CpGenome Universal Methylated DNA; Millipore, Billerica, MA, USA). The same methylated DNA sample was used as a positive control for the qMSP reactions. The Alu repetitive element was utilised as an internal reference to normalise for input DNA. The data were expressed as percent of methylated reference (PMR) values: the median GENE:ALU ratio of a sample was divided by the median GENE:ALU ratio of the positive control and multiplied by 100.

The trade-off between sensitivity and specificity should be taken into consideration when determining the threshold value from quantitative analyses. Increasing the threshold value will give a higher specificity, while the sensitivity will decrease. A lower threshold value will increase the sensitivity, but reduce the specificity. Because the majority of the carcinoma samples analysed in the current project had higher PMR values than the normal mucosa samples, we chose to set a threshold value which would give the highest specificity. Consequently, the threshold values for scoring samples as methylated were set according to the PMR values of the normal mucosa samples.
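The quantification steps described above (standard curve, Ct cut-off, Alu normalisation and PMR) can be sketched as follows. This is an illustration under stated assumptions rather than the analysis software used in the thesis: all function names and all numerical values are hypothetical, censored wells are treated as zero quantity, and the median is taken over triplicate Ct values.

```python
import math
from statistics import median

CT_CUTOFF = 35.0  # Ct values at or above this cycle are censored, as in the text


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(cts) / len(cts)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve; censored wells count as zero (assumption)."""
    if ct >= CT_CUTOFF:
        return 0.0
    return 10 ** ((ct - intercept) / slope)


def pmr(gene_sample, alu_sample, gene_ref, alu_ref):
    """Percent of methylated reference from GENE and ALU quantities."""
    return (gene_sample / alu_sample) / (gene_ref / alu_ref) * 100.0


# Hypothetical 1:5 dilution series (arbitrary units) with hypothetical Ct values.
dilutions = [100.0, 20.0, 4.0, 0.8, 0.16]
gene_cts = [24.1, 26.5, 28.9, 31.4, 33.8]
slope, intercept = fit_standard_curve(dilutions, gene_cts)

# Median of hypothetical triplicate Ct values for one sample.
sample_qty = quantity_from_ct(median([29.8, 30.0, 30.1]), slope, intercept)
print(round(slope, 2), round(sample_qty, 2))
```

A sample whose GENE:ALU ratio is half that of the fully methylated reference would thus obtain a PMR of 50.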
The highest PMR value from the normal mucosa samples for PPP1R14A was PMR = 3.4. Consequently, the threshold was set at PMR = 3.5 for this gene. All samples with a PMR value above this threshold were scored positive for methylation. The corresponding threshold for GLDC was PMR = 2.5. PCR products resulting from qualitative and quantitative MSPs were subjected to DNA sequencing to confirm that the correct fragments had been amplified.

All qMSP primer sets and probes were designed using Primer Express Software v3.0 (Applied Biosystems). The primers were purchased from MedProbe whereas the probes were purchased from Applied Biosystems. For detailed information about the primers and probes used for quantitative analyses, see Appendix IV.

3.2.5 Capillary electrophoresis sequencing

In 1977, Atkinson et al. demonstrated that the incorporation of a dideoxynucleotide (ddNTP) in place of a deoxynucleotide in a growing oligonucleotide chain had an inhibitory effect on DNA synthesis. The dideoxynucleotides lack a 3´ hydroxyl group; thus the chain cannot be extended further after insertion and the synthesis is terminated [100]. Based on this discovery, Sanger et al. developed a new system for DNA sequencing in the mid 1970s [101], and this method has been succeeded by several new sequencing techniques, such as next generation sequencing.

PCR product purification

To remove excess primers and dNTPs prior to sequencing, the MSP and qMSP products were purified using ExoSAP-IT (GE Healthcare, USB Corporation, Ohio, USA), which contains the hydrolytic enzymes exonuclease I and shrimp alkaline phosphatase. A volume of 1.5 µl ExoSAP-IT was added to 10 µl of PCR product and the reaction was incubated at 37° for 15 minutes, followed by an inactivation step at 80° for 15 minutes. The purification was conducted on an Eppendorf Mastercycler Gradient PCR machine.
Sequencing reaction

The sequencing reaction mix consisted of 0.25 µl forward or reverse primer, 2 µl BigDye Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems), 2 µl 5 x BigDye Terminator v1.1 Sequencing Buffer (Applied Biosystems), 2 µl purified PCR product, and Milli-Q water to a total reaction volume of 10 µl. Each of the four dideoxynucleotides present in the sequencing kit is labelled with a fluorescent dye that emits fluorescence at a different wavelength. When one of the four ddNTPs is incorporated instead of a dNTP, the synthesis is terminated, giving fragments of different lengths. The fluorescent labels are excited by a laser beam, which causes them to fluoresce, thus visualising the DNA. The sequencing reaction was performed on a Robocycler Gradient 96 thermal cycler (Stratagene) and the thermal cycling conditions involved the following steps: an initial denaturation at 96° for 2 minutes, followed by 25 cycles of denaturation at 96° for 15 seconds, annealing at 50° for 5 seconds, elongation at 60° for 4 minutes and a final extension step at 6°.

Purification of sequencing products

A gel filtration method based on Sephadex G-50 Superfine (Amersham Biosciences) was utilised to remove excess ddNTPs and primers prior to electrophoresis and laser exposure. The method is based on differential separation of molecules. The molecules pass through a porous gel matrix, with the speed of diffusion depending on their size. Smaller molecules, like ddNTPs and primers, diffuse further into the pores of the gel, and therefore move relatively slowly and are retained in the column. Larger molecules, like the sequencing products, either enter the pores to a lesser extent or do not enter at all, and therefore move more quickly through the gel and are eluted with the buffer. Sephadex powder was poured onto a 96-well Multiscreen HV plate (Millipore) and 300 µl Milli-Q water was added to each well.
The plate was left at room temperature for at least 2 hours, allowing the Sephadex to swell. The Sephadex plate was placed on a 96-well Optical Reaction Plate (Applied Biosystems) and centrifuged at 910 rpm for 5 minutes. To rinse the columns, 150 µl Milli-Q water was added to the wells and the plate was once again centrifuged at 910 rpm for 5 minutes. The Sephadex plate was then placed on a new 96-well Optical Reaction Plate (Applied Biosystems), 10 µl Milli-Q water and 10 µl product were added, and a final centrifugation at 910 rpm for 6 minutes was conducted.

An ABI PRISM 3730 Sequencer was used to separate the oligonucleotide fragments according to their sizes. The laser beam excites the fluorescently labelled ddNTPs, which then emit light of different wavelengths. The software interprets the fluorescence data and visualises the results as electropherograms. All electropherograms were analysed manually using Sequencing Analysis 5.2 (Applied Biosystems).

3.2.6 Bisulfite sequencing

Bisulfite sequencing - the gold standard of DNA methylation analysis

MSP relies on the match and mismatch of primers to bisulfite-treated DNA. The method can, however, in some instances produce false positive results, for example when primers anneal to unmethylated cytosines that were not converted by the bisulfite treatment. Such samples may wrongly be scored as methylated. It is therefore necessary to confirm that the promoter area is indeed methylated, and that the MSP primers and the qMSP primers and probe anneal to relevant regions of the promoter. A representative promoter region of GLDC and PPP1R14A was subjected to bisulfite sequencing in colon cancer cell lines. To calculate the approximate amount of methylation at each CpG site, the peak height of the cytosine signal was divided by the sum of the cytosine and thymine peak heights and multiplied by 100 to convert ratios to percentages.
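The peak-height calculation just described, together with the classification bands used in this section (0%-20%, 21%-80% and 81%-100%), can be sketched as below. The function names and peak heights are hypothetical, and the handling of values that fall between the stated bands (for example 20.5%) is an assumption, since the text does not specify it.

```python
def methylation_percent(c_peak, t_peak):
    """Approximate methylation at one CpG site: C / (C + T) * 100."""
    total = c_peak + t_peak
    if total == 0:
        raise ValueError("no cytosine or thymine signal at this position")
    return 100.0 * c_peak / total


def classify_cpg(percent):
    """Map a methylation percentage onto the three classes used in the text.

    Assumption: boundaries are applied as <= 20 and <= 80, so values between
    the stated bands fall into the higher class.
    """
    if percent <= 20:
        return "unmethylated"
    if percent <= 80:
        return "partially methylated"
    return "hypermethylated"


# Hypothetical peak heights (arbitrary fluorescence units) for three CpG sites.
for c_peak, t_peak in [(0, 500), (300, 100), (900, 100)]:
    pct = methylation_percent(c_peak, t_peak)
    print(pct, classify_cpg(pct))
```

When the reverse complementary strand is read, the corresponding guanine and adenine peaks would take the place of cytosine and thymine in the same ratio.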
CpG sites with methylation frequencies ranging between 0% - 20% were classified as unmethylated, CpG sites with frequencies in the range of 21% - 80% were classified as partially methylated, and CpG sites with frequencies ranging from 81% - 100% were classified as hypermethylated.

Designing bisulfite sequencing primers

Unlike MSP primers, bisulfite sequencing primers must be designed so that they do not discriminate between methylated and unmethylated sequences. The primers should therefore anneal to sequences without CpG sites. The amplified fragment should cover the area amplified by the MSP primers. Additionally, mononucleotide repeats consisting of 9 or more bases should be avoided, as the polymerase can make slippage mistakes that may lead to insertions or deletions in the amplified product, and eventually to errors in the sequence. The bisulfite sequencing primers were designed using Methyl Primer Express 1.0 (Applied Biosystems), and the reactions were optimised for magnesium concentration and annealing temperature to ensure equal amplification efficiency for the unmethylated and methylated DNA fragments.

Bisulfite sequencing reaction

Prior to bisulfite sequencing, a PCR was performed and 5 µl product together with 1 µl loading buffer (1 x TAE buffer and 0.1% xylene cyanol) was loaded onto a 2% gel for visualisation. Ten µl of the remaining product was purified with ExoSAP-IT. The sequencing reaction and post-sequencing Sephadex purification were performed as previously described. For the bisulfite sequencing reaction, the dGTP BigDye Terminator v3.0 Cycle Sequencing Ready Reaction Kit (for GC-rich areas) was utilised.

3.2.7 Statistics

The Pearson Chi-square (χ2) and Fisher´s exact tests are used to establish whether or not the observed frequency distribution differs from a theoretical distribution, given that the H0 hypothesis (no association between the variables) is true.
The more the observed outcome diverges from the expected outcome, the less likely it is that the null hypothesis is true, thus giving a low P value. For this thesis, all 2 x 2 contingency tables were analysed using a two-sided Fisher´s exact test, whereas a two-sided Pearson Chi-square test was used on 2 x 3 and 2 x 4 contingency tables. P values less than or equal to 0.05 (5%) were considered significant. Binary regression analyses were used to examine possible association between DNA methylation and patient age. Receiver Operating Characteristics (ROC) curves for individual genes were created using PMR values and tissue type (carcinoma and normal) as input. All calculations are derived from two-tailed statistical tests using the SPSS 16.0 software.

4. Results

4.1 Qualitative methylation analyses of candidate genes in vitro and in vivo

Ten genes were analysed for promoter hypermethylation in colon cancer cell lines. The promoters of BNIP3, CBS, DDX43, GLDC, PEG10, PPP1R14A and WDR21B were frequently hypermethylated, with frequencies of 14/20 (70%), 14/19 (74%), 17/19 (89%), 15/20 (75%), 18/20 (90%), 19/20 (95%) and 20/20 (100%), respectively. RBP7 displayed an intermediate level of methylation with a frequency of 9/20 (45%), while RASSF4 and IQCG displayed low levels of methylation, with frequencies of 1/19 (5%) and 0/20 (0%), respectively. These results are presented in Table 1 and Table 2. Several of the cell lines showed monoallelic methylation (one allele is methylated whereas the other is unmethylated). However, biallelic methylation (both alleles are methylated) was found for three MSI cell lines (Co115, RKO, SW48) and three MSS cell lines (EB, HT29, IS2) across all ten genes.

Table 1. Promoter methylation statuses of candidate genes in 20 colon cancer cell lines, stratified according to their MSI status.
Abbreviations: MSI, microsatellite instability; MSS, microsatellite stable; U, unmethylated; M, methylated; U/M, partially methylated; ND, not determined.

Table 2. Methylation frequencies among MSI and MSS colon cancer cell lines. Abbreviations: MSI, microsatellite unstable; MSS, microsatellite stable.

The genes that were hypermethylated in cell lines were subjected to downstream in vivo methylation analysis in primary colorectal carcinomas. BNIP3, DDX43, GLDC, PEG10, PPP1R14A and WDR21B were methylated in 8/19, 19/20, 14/20, 18/19, 11/20 and 20/20 carcinomas, respectively. Additionally, normal mucosa samples from deceased, cancer-free individuals were analysed. DDX43, PEG10 and WDR21B were methylated in all normal samples, BNIP3 was methylated in 1/10 (10%) of the samples, whereas GLDC and PPP1R14A were unmethylated. The qualitative data from these analyses are summarised in Table 3. GLDC showed a higher degree of methylation among MSI primary colorectal carcinomas than among MSS tumours (P = 0.05).

Table 3. Methylation frequencies in primary colorectal carcinomas and normal mucosa samples. Abbreviations: MSI, microsatellite unstable; MSS, microsatellite stable; CRC, colorectal cancer.

4.2 Quantitative methylation profiles of GLDC and PPP1R14A

Conventional MSP analyses revealed that GLDC and PPP1R14A were among the most frequently hypermethylated genes in the qualitative pilot study. Furthermore, the results indicated that the methylation profiles were cancer-specific, meaning that the genes were differentially methylated in primary colorectal carcinomas versus normal mucosa. In order to validate these findings, GLDC and PPP1R14A were further investigated by real-time, quantitative PCR in a larger series of malignant and normal colorectal tissue samples, as well as in the 20 colon cancer cell lines that were analysed qualitatively.
Promoter hypermethylation of GLDC and PPP1R14A was found in 28/47 (60%) and 27/47 (57%) of the primary colorectal tumours, respectively. The corresponding numbers in cell lines were 15/20 (75%) and 19/20 (95%). As previously mentioned, samples were scored as positive for methylation if the PMR value was > 3.5 for PPP1R14A and > 2.5 for GLDC (see section 3.2.4). With these cut-off values, none of the normal mucosa samples were scored as methylated for either gene, resulting in 100% specificity for both assays. The results are visualised in Figure 12 and Figure 13.

Figure 12. Amplification plots displaying quantitative methylation measurements in colorectal carcinoma and normal mucosa samples. The upper part of the figure illustrates the successful amplification plots for GLDC in colorectal carcinoma and normal mucosa samples. The lower part of the figure shows the results for PPP1R14A. Fluorescence intensity (y-axis) is plotted versus the number of PCR cycles (x-axis). The red line indicates the cycle threshold (Ct), while the vertical, dashed line indicates the cut-off value (Ct = 35). All PCR products with Ct values equal to or above 35 were censored (see Materials and methods).

Figure 13. Box plots showing median PMR values as assessed by qMSP. The box plots show the distribution of PMR values according to the median, upper and lower quartiles. The lines inside the boxes denote median values whereas whiskers represent the interval between the 10th and 90th percentiles. Circles indicate outliers, and the star indicates an extreme outlier.

ROC curve analysis was applied as a statistical method to assess the diagnostic accuracy of the genes as biomarkers. GLDC had a sensitivity of 64% and a specificity of 100%, with an area under the curve (AUC) of 0.819 (P = 7 × 10⁻⁸). PPP1R14A had a sensitivity of 57.5% and a specificity of 100%, with an AUC of 0.792 (P = 8.59 × 10⁻⁷).
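How sensitivity, specificity and AUC follow from PMR values and a cut-off can be illustrated with a short sketch. The PMR lists below are invented for illustration and are not data from this thesis; the function names are likewise hypothetical, and the AUC is computed by the rank-based (Mann-Whitney) definition rather than by SPSS.

```python
def sensitivity_specificity(tumour_pmr, normal_pmr, threshold):
    """Score samples as methylated when PMR > threshold.

    Sensitivity: fraction of tumours scored methylated.
    Specificity: fraction of normals scored unmethylated.
    """
    true_pos = sum(1 for v in tumour_pmr if v > threshold)
    true_neg = sum(1 for v in normal_pmr if v <= threshold)
    return true_pos / len(tumour_pmr), true_neg / len(normal_pmr)


def auc(tumour_pmr, normal_pmr):
    """Probability that a random tumour has a higher PMR than a random normal.

    Ties contribute one half, which equals the area under the ROC curve.
    """
    wins = 0.0
    for t in tumour_pmr:
        for n in normal_pmr:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumour_pmr) * len(normal_pmr))


# Hypothetical PMR values; the highest normal value mirrors the 3.4 in the text.
tumours = [0.0, 1.2, 5.0, 40.0, 80.0]
normals = [0.0, 0.5, 1.0, 3.4]
sens, spec = sensitivity_specificity(tumours, normals, threshold=3.5)
print(sens, spec, auc(tumours, normals))
```

Setting the threshold just above the highest normal value, as done here, fixes the specificity at 100% and lets the sensitivity follow from the tumour distribution.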
The ROC curves visualise the unbiased trade-off between sensitivity and specificity. Interestingly, the sensitivity and specificity values obtained from the ROC curve analyses are concordant with the estimates we obtained by visual determination of cut-off values. The ROC curves are shown in Figure 14 and Figure 15.

Figure 14. ROC curve analysis of quantitative methylation-specific PCR results for GLDC. The ROC curve was constructed for the qMSP assay on the basis of PMR values for colorectal carcinomas (n = 47) and normal mucosa samples (n = 49). The AUC was 0.819, and the sensitivity and specificity were 64% and 100%, respectively.

Figure 15. ROC curve analysis of quantitative methylation-specific PCR results for PPP1R14A. The ROC curve was constructed for the qMSP assay on the basis of PMR values for colorectal carcinomas (n = 47) and normal mucosa samples (n = 49). The AUC was 0.792, and the sensitivity and specificity were 57.5% and 100%, respectively.

4.3 Concordance of conventional MSP and quantitative real-time MSP

The results of the qMSP analyses were compared with those obtained by conventional MSP in colon cancer cell lines (n = 20), primary tumours (n = 11) and normal tissue samples (n = 8). While conventional MSP scores samples as methylated, partially methylated or unmethylated for cell lines, and methylated or unmethylated for tissue samples, qMSP gives a quantitative measurement of DNA methylation levels ranging from 0 to 100 (Figure 16). The cut-off values of 2.5 for GLDC and 3.5 for PPP1R14A resulted in good concordance between data obtained from qMSP and conventional MSP analyses. For GLDC, 39/39 (100%) of the samples were concordant (P < 0.001). PPP1R14A, however, had one sample which was scored as unmethylated by qualitative, gel-based MSP and as methylated by the quantitative real-time MSP analysis. Consequently, the methylation status was in agreement for 38/39 (97%) of the samples (P = 2 × 10⁻⁹).
The results are illustrated in Figure 17 and Figure 18 and summarised in Table 4 and Table 5.

Figure 16. Comparative analysis of conventional MSP and qMSP results. Qualitative MSP scores are compared with quantitative PMR values as a proof-of-principle. The figure reveals that there is concordance between the data obtained by the two methods. Red indicates highly methylated samples, while blue indicates unmethylated samples. The figure illustrates the quantitative differences between traditional MSP and qMSP. While MSP samples are scored as either unmethylated (blue) or methylated (red), PMR values reveal the absolute quantitative value of a sample. NB, normal blood (positive control for unmethylated samples); IVD, in vitro methylated DNA (positive control for methylated samples).

Figure 17. Comparison of scores obtained from conventional MSP and quantitative MSP analysis of GLDC. Box plots relate the PMR values (y-axis) determined for 39 samples by qMSP to the scores obtained by conventional MSP for unmethylated (green) and methylated (red) samples (x-axis). The dashed line represents the cut-off value (PMR = 2.5).

                     Quantitative real-time MSP with cut-off = 2.5
Conventional MSP     Unmethylated    Methylated    Total
Unmethylated         15              0             15
Methylated           0               24            24
Total                15              24            39

Table 4. Concordance of classification of the GLDC status by the two methods.

Figure 18. Comparison of scores obtained from conventional MSP and quantitative MSP of PPP1R14A. Box plots relate the PMR values (y-axis) determined for 39 samples by qMSP to the scores obtained by conventional MSP for unmethylated (green) and methylated (red) samples (x-axis). The dashed line represents the cut-off value (PMR = 3.5).

                     Quantitative real-time MSP with cut-off = 3.5
Conventional MSP     Unmethylated    Methylated    Total
Unmethylated         11              1             12
Methylated           0               27            27
Total                11              28            39

Table 5. Concordance of classification of the PPP1R14A status by the two methods.
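The concordance figures can be recomputed from the 2 x 2 counts in Table 4 and Table 5. The sketch below uses only the Python standard library; the function names are hypothetical, and the two-sided Fisher´s exact test is implemented by summing all tables that are no more probable than the observed one, which is one common convention. The resulting P values need not coincide exactly with those reported from SPSS.

```python
from math import comb


def percent_agreement(table):
    """Share of samples on the diagonal of a 2 x 2 table, in percent."""
    (a, b), (c, d) = table
    return 100.0 * (a + d) / (a + b + c + d)


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2 x 2 table.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are at most as probable as the observed table.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: conventional MSP (unmethylated, methylated); columns: qMSP.
gldc = [[15, 0], [0, 24]]
ppp1r14a = [[11, 1], [0, 27]]
for name, table in [("GLDC", gldc), ("PPP1R14A", ppp1r14a)]:
    print(name, round(percent_agreement(table), 1), fisher_exact_two_sided(table))
```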
4.4 Bisulfite sequencing confirms the promoter methylation status of GLDC and PPP1R14A

Bisulfite sequencing of GLDC and PPP1R14A in colon cancer cell lines showed that all non-CpG cytosines were fully converted to thymine. These results, along with detailed sequencing results and MSP status, are shown in Figure 19 and Figure 20. In general, the majority of the cell lines that were scored as fully methylated by MSP were also fully methylated according to the bisulfite sequencing analyses. For GLDC, a good association was seen between the MSP scores and the bisulfite sequencing results. Similarly, a good association was seen between the MSP scores and the bisulfite sequencing analyses of PPP1R14A, except for V9P. This cell line was scored as partially methylated by MSP analyses of PPP1R14A; however, all CpG sites were unmethylated according to the bisulfite sequencing electropherograms.

Figure 19. Bisulfite sequencing verifies site-specific methylation within the GLDC promoter. A) The upper part of the figure is a schematic presentation of the CpG sites amplified by the bisulfite sequencing primers. The arrows indicate the location of the MSP primers, the transcription start site is represented by +1 and the vertical bars indicate the location of the individual CpG sites. For the lower part of figure A, filled circles represent methylated CpGs; open circles represent unmethylated CpGs; and grey circles represent partially methylated CpG sites. The right column of U, M and U/M lists the methylation status of the cell lines as assessed by MSP analyses. B) Representative bisulfite sequencing electropherograms of the GLDC promoter in colon cancer cell lines. A subsection of the reverse complementary bisulfite sequence electropherogram, covering CpG sites -1 to -4 relative to the transcription start site. Cytosines residing in CpG sites are indicated by a lollipop, whereas cytosines residing in non-CpG sites are underlined.
Black lollipops indicate methylated CpGs, whereas open lollipops indicate unmethylated CpGs. The GLDC promoter sequencing electropherograms illustrated here are from the unmethylated Colo320 cell line and the hypermethylated IS1 cell line.

Figure 20. Bisulfite sequencing verifies site-specific methylation within the PPP1R14A promoter. A) The upper part of the figure is a schematic presentation of the CpG sites amplified by the bisulfite sequencing primers. The arrows indicate the location of the MSP primers, the transcription start site is represented by +1 and the vertical bars indicate the location of the individual CpG sites. For the lower part of figure A, filled circles represent methylated CpGs; open circles represent unmethylated CpGs; and grey circles represent partially methylated CpG sites. The right column of U, M and U/M lists the methylation status of the cell lines as assessed by MSP analyses. B) Representative bisulfite sequencing electropherograms of the PPP1R14A promoter in colon cancer cell lines. A subsection of the reverse complementary bisulfite sequence electropherogram, covering CpG sites -22 to -15 relative to the transcription start site. Cytosines residing in CpG sites are indicated by a lollipop, whereas cytosines residing in non-CpG sites are underlined. Black lollipops indicate methylated CpGs, whereas open lollipops indicate unmethylated CpGs. The PPP1R14A promoter sequencing electropherograms illustrated here are from the hypermethylated RKO and EB cell lines.

4.5 Association of tumour methylation with genetic and clinico-pathological features

The DNA methylation status of GLDC and PPP1R14A was compared with genetic and clinico-pathological features of the tumours. DNA methylation frequencies were higher among MSI tumours, with statistical significance for PPP1R14A (P = 0.001). DNA methylation of both genes was associated with proximal location; however, the association was only statistically significant for PPP1R14A (P = 0.0008).
Tumour methylation was associated with wild-type BRAF, with P = 0.015 for GLDC and P = 0.0004 for PPP1R14A. There was no association between mutation of KRAS, PTEN, PIK3CA or TP53 and methylation of either gene. No significant association was found between DNA methylation and clinico-pathological data such as Dukes' staging or sex of the patient. No association was observed between patient age and tumour methylation for PPP1R14A. However, binary regression analysis showed a higher age among colorectal cancer patients who were positive for methylation of GLDC (mean 74, 95% confidence interval 70.09-77.98, standard deviation 10.163) compared with patients who were negative for methylation (mean 64.7, 95% confidence interval 58.18-71.19, standard deviation 13.499; P = 0.02). Co-methylation of the two genes was seen in 34/47 samples and was not associated with Dukes' staging, or with sex or age of the patient. Not surprisingly, co-methylation was more frequent in MSI tumours (P = 0.05) and in tumours harbouring wild-type BRAF (P = 0.021). Furthermore, carcinomas located in the proximal colon showed more frequent co-methylation than carcinomas located in the distal colon (P = 0.018). Co-methylation was not associated with mutation of KRAS, PTEN, PIK3CA or TP53.

5. Discussion

5.1 Methodological considerations

5.1.1 Methylation-specific polymerase chain reaction

The method by which converted DNA is analysed can influence the interpretation of the methylation status of the DNA. For this thesis, both qualitative and quantitative methylation-specific PCR were applied. The differences between these methods lie in the detection of the amplified PCR products, as well as in the assay design. In the first cycles of a PCR reaction, all samples will amplify exponentially. However, the reaction will eventually reach a plateau phase caused by, e.g., depletion of reagents and product renaturation competing with primer binding.
The point at which the reaction reaches the plateau phase varies between samples, and might even vary between replicates. Consequently, samples that started out with different amounts of template may show the same quantity when measured at the end phase. To precisely determine the initial sample quantity, measurements from the exponential phase should be used. Traditional MSP is based on detection of the end-product using an agarose gel stained with an intercalating dye, such as ethidium bromide. Furthermore, end-point detection is based on visual estimation of the quantity of the target sequence. In contrast to traditional MSP, quantitative MSP measures the data at the exponential phase of the PCR reaction, and the need for post-PCR processing, such as staining and separation on an agarose gel, is eliminated. The MSP and qMSP assays are designed to recognise the fully methylated version of the sequence, and the results can therefore be considered highly conservative. However, the qMSP assay also contains an oligonucleotide probe, which ensures an even greater degree of specificity for the methylated target sequence.

The traditional MSP assay serves as a robust and sensitive screening process to discover genes that have high methylation frequencies in cell lines and primary tissues. The method does not include fluorescence and is therefore less expensive than qMSP. The results of the quantitative and qualitative analyses of GLDC and PPP1R14A were highly concordant. Even though the scoring of traditional MSP is visual, the good concordance with the qMSP results indicates that the band intensities are indeed semi-quantitative. Sample 1047 was scored as methylated from the qMSP analyses, while the sample was scored as unmethylated from MSP analyses.
This emphasizes the increased sensitivity of real-time quantitative analysis, which detects one methylated allele in a pool of 10,000 unmethylated alleles. In comparison, conventional MSP detects one methylated allele in a pool of 1,000 unmethylated alleles.

In qMSP, each sample must be normalised for input DNA, using a CpG-independent, bisulfite-specific control. Single-copy housekeeping genes have traditionally been used for this purpose. However, rearrangements, duplications and deletions of genes or chromosomes are frequent events in human cancers and may profoundly affect the PCR yield for these genes. In the present thesis, a part of the ALU repetitive element was used as an internal reference. As opposed to single-copy genes, ALU is present in approximately 1 million copies in a haploid genome, and is dispersed throughout the genome. Copy number changes due to the alterations mentioned above are therefore less likely to affect this internal control [38]. The normalised quantity value of the samples must be compared with a fully methylated human genomic DNA sample. For this thesis, commercially available in vitro methylated DNA (IVD), which is methylated at a minimum of 99% of the CpG sites in the genome, was utilised as a positive control for methylation. Hence, the PMR values reflect the degree of methylation per sample relative to the IVD. A few of the PMR values obtained for the GLDC and PPP1R14A assays were greater than 100%. A possible explanation for this could be incomplete SssI treatment of the reference sample. The assay could incidentally be designed to cover a CpG site which is unmethylated in the reference IVD sample, but methylated for the genes of interest. Consequently, the degree of methylation for the genes will be higher than in the reference sample. None of the assays amplified bisulfite-treated unmethylated DNA (normal blood) or non-template controls (water), confirming the specificity of the assays.
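The normalisation described above can be illustrated with a short calculation. The following is a minimal sketch, assuming the commonly used form of the PMR (percent of methylated reference) in which the gene/ALU quantity ratio of a sample is divided by the same ratio in the IVD reference and multiplied by 100. The function name and all quantities are hypothetical and are not taken from the thesis data.

```python
def percent_methylated_reference(gene_sample, alu_sample, gene_ivd, alu_ivd):
    """Return the PMR value (in percent) for one sample.

    All four arguments are quantities read from the qMSP standard curve:
    the target gene and the ALU reference, each measured in the sample
    and in the fully methylated IVD control (assumed inputs).
    """
    if min(alu_sample, gene_ivd, alu_ivd) <= 0:
        raise ValueError("ALU and IVD quantities must be positive")
    sample_ratio = gene_sample / alu_sample  # normalise for input DNA
    ivd_ratio = gene_ivd / alu_ivd           # same ratio in the reference
    return 100.0 * sample_ratio / ivd_ratio


# Hypothetical quantities, chosen only to illustrate the arithmetic.
pmr_typical = percent_methylated_reference(30.0, 120.0, 50.0, 100.0)
pmr_above_100 = percent_methylated_reference(66.0, 120.0, 50.0, 100.0)
print(f"typical sample: PMR = {pmr_typical:.1f}%")
print(f"sample exceeding the reference: PMR = {pmr_above_100:.1f}%")
```

The second call shows how a PMR above 100% arises whenever the normalised quantity in the sample exceeds that of the reference, for instance if the reference is not completely methylated at the CpG sites covered by the assay.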
5.1.2 Bisulfite sequencing

Bisulfite sequencing serves two purposes when analysing samples for DNA methylation. First, as mentioned in section 3.2.6, MSP analysis may produce false positive results due to amplification of unconverted, unmethylated cytosines, and bisulfite sequencing is performed to verify the methylation statuses. Second, bisulfite sequencing confirms whether or not the MSP assay is designed to amplify a representative region of the gene promoter. The importance of MSP study design and verification by DNA sequencing has been exemplified by articles on DNA methylation of the MAL tumour-suppressor gene. Mori et al reported a methylation frequency of 6% in colorectal carcinomas, while a study published by Lind et al found hypermethylation of the MAL promoter in approximately 80% [102-104]. While Mori and co-workers designed primers that were located a couple of hundred base pairs upstream of the transcription start site, the assay designed by Lind et al included primers located very close to the transcription start site.

In the present thesis, concordance was seen between the MSP results and the methylation statuses of individual CpG sites as assessed by bisulfite sequencing. The cell line V9P was scored as partially methylated from qualitative analyses and as methylated from quantitative MSP analyses of PPP1R14A. However, the bisulfite sequencing electropherograms revealed a completely unmethylated region. This discrepancy could not be caused by poor primer design, as both MSP and qMSP (with an increased specificity due to the probe) gave the same score. If the bisulfite conversion of V9P was sub-optimal, unmethylated cytosines would be wrongly scored as methylated in the subsequent MSP analysis. However, all non-CpG cytosines were deaminated to uracil and amplified as thymine, confirming that the bisulfite modification was indeed successful.
All of the factors mentioned above highlight the importance of optimal design and optimisation of the assays. A possible solution to the above-mentioned problem would be to clone the PCR product. Direct sequencing, such as bisulfite sequencing, gives an average value for methylation of the analysed sample, ensuring a representative methylation profile of the sample. However, cloning of the PCR product and subsequent sequencing of the individual clones will give a more accurate profile, as individual clones might contain different degrees of methylation. This approach is therefore suitable to elucidate the level of methylation heterogeneity in a sample.

5.2 Cell lines versus solid tumours

A cell line is a permanently established cell culture that, in contrast to normal cells that reach senescence, will proliferate indefinitely. Human cancer cell lines resemble the phenotypic, genetic and epigenetic characteristics of their original tumour, and are consequently important experimental tools in understanding the behaviour of the primary tumours. Cancer cell lines are easy to culture in vitro, they yield substantial amounts of high-quality DNA and RNA, and as opposed to heterogeneous primary tumours, cancer cell lines are usually not “contaminated” by normal cells [105]. An additional advantage is the commercial availability of a broad series of tumour types. A number of immortalisation methods have been developed in order to obtain permanent cell lines, including exposure to a DNA tumour virus such as SV40 virus, EBV or papilloma virus, cell fusion between the cell with a limited lifespan and a permanent cell line, and treatment with carcinogenic chemicals [106]. Conversely, differences between subtypes within a cell line population can develop. When cells are grown over numerous generations, faster growing cells will eventually predominate over the other cells.
If these clones have acquired new genetic and/or epigenetic characteristics, a biased selection could occur, resulting in a cell line population that is less representative of the original tumour. Furthermore, exposure to environmental variations, such as relocation to other laboratories with different temperatures, may lead to a selection of cell lines that thrive better under specific conditions.

For this thesis, cell lines were initially cultured with AZA and/or TSA and the gene expression levels were measured. The cell lines were subsequently utilised in a pilot analysis to determine which genes should be subjected to methylation analyses in tissue samples, which are a unique and valuable material. Several studies have reported a higher prevalence of DNA methylation in cancer cell lines when compared to matching primary tumours, while others have seen an equal extent of methylation [107,108]. Notably, colon cancer cell lines are among the cell lines that most closely resemble their matching primary tumours. We therefore expect the resulting cell line methylation frequencies to be a good indication of the frequency in primary tumours.

5.3 Novel epigenetically deregulated genes in colorectal cancer

In the present study, a comprehensive, genome-wide microarray analysis in combination with downstream methylation analysis was used to identify target genes inactivated by DNA hypermethylation. All ten genes analysed in the present study were upregulated in colon cancer cell lines after treatment with demethylating chemicals. All of the genes were further found to be methylated in colon cancer cell lines, with the exception of IQCG and RASSF4 (partially methylated in a single cell line). These genes might be considered as “false positives” in our resulting gene list.
This apparently false response to the epigenetic drug treatment might be due to an associated hypermethylated enhancer element, or possibly a cellular response to the toxic effect of the chemicals. However, our approach is generally very successful and has a lower rate of false positives when compared with other studies using variants of the same approach. This indicates that our fairly conservative approach is highly effective.

Six of the ten genes analysed in the current project were hypermethylated in cell lines, and were subjected to methylation analyses in primary tumours and normal tissue samples. Validation analyses were performed on the two most promising genes, GLDC and PPP1R14A. Both of these genes displayed a frequent and cancer-specific methylation profile. Notably, the genes presented in this thesis were not as promising as biomarkers as other genes previously identified in our lab. A possible explanation for this could be that the gene expression analyses which identified the previous biomarkers were performed with cDNA gene expression microarrays, while the candidate genes in this project were identified using the AB1700 microarray platform. The same approach was used; however, a lower concentration of AZA was used during cell culture to avoid a possible cytotoxic effect of the drug.

BNIP3, a pro-apoptotic member of the Bcl-2 family, has been shown to be frequently methylated in gastric and colorectal cancer. The expression of the gene is induced in hypoxic regions of tumours. The presence of hypoxic regions in tumours is often associated with poor prognosis, because hypoxic tumour cells can develop resistance to chemotherapy and radiation treatment [109,110]. Although tumours often contain hypoxic regions, the cells of the tumours must survive and continue to grow.
One possible mechanism by which the tumour accomplishes this is by induction of hypoxia-inducible factor-1 (HIF-1), which upregulates genes that are involved in glycolysis, angiogenesis and cell survival. However, HIF-1 has also been shown to have a pro-apoptotic function, by activating apoptotic signalling pathways involving HIF-1-mediated expression of BNIP3 [109]. Over-expression of BNIP3 leads to opening of the mitochondrial permeability transition pore, thereby abolishing the proton electrochemical gradient, which is followed by chromatin condensation and DNA fragmentation [110]. Thus, the cancer cells must find a way to escape this apoptotic function. Inactivation of BNIP3 by promoter hypermethylation could be one mechanism by which the cancer cells can escape apoptosis. The cancer-specific methylation frequencies observed in the present thesis indicate that BNIP3 possibly contributes to tumourigenesis of colorectal cancer, possibly by limiting the susceptibility of cancer cells to hypoxia-induced apoptosis.

PEG10 is a maternally imprinted and paternally expressed gene, which has been reported to be overexpressed in hepatocellular carcinoma and B-cell chronic lymphocytic leukemia [111]. It has been suggested that the PEG10 protein may block TGF-β signalling by binding to TGF-β receptor II [112]. The transforming growth factor β (TGF-β) pathway is important in several cellular processes, including cell growth, cell differentiation and apoptosis. In relation to cancer, it is possible that the genes associated with this pathway are subjected to different selective pressures, and are selected for loss of function at early stages but gain of function in late stages of the disease. In this context, inactivation of PEG10 in late cancer would lead to gain of function for the genes involved in the pathway.
These findings suggest that PEG10 inactivation might be of general importance in late cancer, as well as in normal tissue, as promoter hypermethylation is found in 95% of the colorectal carcinomas and 100% of the normal mucosa samples analysed in the present thesis.

DDX43 was, to our surprise, a testis cancer antigen (TCA). Generally, we would prefer to avoid such genes by not selecting them for downstream methylation analyses. The reason for this is that, as briefly mentioned in the introduction, TCAs are usually not expressed in normal tissue, with the exception of the testis, due to hypermethylation of the gene promoter [113]. The protein encoded by this specific gene is an ATP-dependent RNA helicase in the DEAD-box family. It is well understood that dysregulation of the molecules that participate in RNA processing can potentially affect normal cellular homeostasis and contribute to cancer development and/or progression. Based on the findings of the current project, we propose that hypermethylation of DDX43 is an important feature of normal mucosa and colorectal cancer.

In the present thesis we report that hypermethylation of WDR21B is present in tumour and normal samples. WDR21B codes for a protein containing a conserved WD40 domain. The domain covers a wide range of functions including regulation of signal transduction, pre-mRNA processing and cytoskeleton assembly. All of these cellular processes are important in cancer development, and alterations in any of them can give the cancer cell selective advantages.

The data provided in this thesis present GLDC and PPP1R14A as novel hypermethylated target genes in colorectal cancer. To our knowledge, GLDC and PPP1R14A have not been reported to be silenced through promoter hypermethylation in any physiological or biological context. GLDC codes for a component of the enzyme system needed to cleave glycine into smaller pieces.
The system is composed of four protein components, and a defect in any one of these components may cause glycine encephalopathy (OMIM #238300). Mutations in GLDC account for most cases of glycine encephalopathy [114]. The breakdown of excess glycine is necessary for the normal development and function of nerve cells in the brain and spinal cord. Furthermore, dietary glycine has been found to inhibit angiogenesis during tumour growth, indicating that glycine is a necessary precursor in building new blood vessels [115]. Some GLDC mutations may lead to the production of a nonfunctional version of the glycine cleavage system, thus preventing the system from breaking down glycine. Inactivation of GLDC due to promoter hypermethylation may have the same consequences. As a result, excess glycine can build up, and cancer cells can exploit this in the process of angiogenesis.

PPP1R14A, on the other hand, is a phosphorylation-dependent inhibitor of smooth muscle myosin phosphatase. Inhibition of PPP1R14A leads to increased myosin phosphorylation and enhances smooth muscle contraction in the absence of increased intracellular Ca2+ concentration (PPP1R14A, OMIM #608153). Smooth muscle is found in blood vessels, where it regulates the flow of blood in arteries. The cancer-specific hypermethylation of PPP1R14A observed in this thesis underlines the importance of smooth muscle contractions in tumour blood flow, which is necessary in order to supply the cancer cells with oxygen and other factors needed to survive and grow. Furthermore, the smooth muscle contractions may also help transport cancer cells to other parts of the body during metastasis. Single cancer cells can break away from an established solid tumour, enter the blood vessel, and be carried to a distant site by muscle contractions, where they can implant and begin the growth of a secondary tumour.
It has been debated whether promoter hypermethylation plays a direct causal role in tumour development and progression, or if it is merely a consequence of the abnormal phenotype of cancer cells. Several lines of evidence suggest that DNA methylation plays a direct and causative role in tumourigenesis. This is exemplified by the discovery of promoter hypermethylation in aberrant crypt foci, underlining that epigenetic events occur early in colorectal cancer [116]. Furthermore, reduced DNA methylation was found to suppress the formation of intestinal polyps in mice [117]. Additionally, sporadic cases of colorectal cancer displayed high frequencies of deregulation of the mismatch repair gene hMLH1. Treatment of cell lines with AZA resulted in re-expression of hMLH1 and restoration of the mismatch repair ability, indicating that hypermethylation of hMLH1 was the primary inactivating event [118]. All of these results support the view that DNA hypermethylation serves as one of the initiating events contributing to tumourigenesis. However, it is important to highlight that the effect of DNA hypermethylation and its contribution to cancer development is gene-specific, meaning that the function of the gene determines whether inactivation by hypermethylation is important in cancer development and progression.

As mentioned previously, our objective with this study was to identify biomarkers for colorectal cancer. In this setting, cancer-specific methylation (defined as the unique presence of methylation in cancer cells) has been shown to be highly suitable. Age-specific methylated target genes, such as ER, MyoD and N33, are found to be methylated also in normal tissue in elderly patients, and the methylation increases with age. This non-cancer-specific methylation is therefore not suitable for determining the presence of cancer cells. This is because some tumour-suppressor genes become aberrantly methylated after exposure to environmental factors and with aging.
Consequently, these factors could lead to false positives in the study of tumour-specific methylation in colorectal cancer. We did not observe a statistically significant association between age and methylation of PPP1R14A in the current project. However, binary regression analysis revealed a statistically significant association between methylation of GLDC and higher patient age. It is important to emphasize that the observed association does not indicate that the methylation is age-specific, as this is defined by methylation of normal tissue samples, and all of the normal samples were unmethylated for GLDC.

Both of the genes analysed were more frequently methylated in MSI tumours than in MSS tumours. Furthermore, because MSI tumours are generally more common in the proximal colon, it is not surprising that we found an association between gene-specific methylation of GLDC and PPP1R14A and the proximal colon. This indicates that hypermethylation of these genes could be more important in proximal colon tumourigenesis than in distal colon tumourigenesis.

5.4 Early detection and diagnostics

Screening for colorectal cancer (CRC) has been shown to reduce cancer-related death by detecting early stage CRC and pre-malignant lesions. As mentioned in the introduction, survival among CRC patients depends on tumour stage at time of diagnosis. This is because discovering the disease at a stage where the tumour is localised would mean that the majority of the patients could be cured by surgery alone. Consequently, detection of early stage CRC can increase the survival rate dramatically. Several countries, such as Germany and Italy, are using colonoscopy as the primary screening tool. This is a method that can accurately detect early cancerous lesions and is by many considered to be the gold standard of colorectal cancer screening. However, the effectiveness of the procedure is limited by its invasive nature, as well as by the need for a skilled physician to perform it.
In contrast, a non-invasive screening test, using a patient's blood or fecal samples, would be simple and more cost effective, and would also increase patient compliance. The ideal samples for early diagnosis are material collected in a non-invasive way, which at the same time contains methylated DNA.

The stool blood test, also known as the faecal occult blood test (FOBT), detects the presence of occult blood in the stool. FOBT has been shown to reduce CRC-related mortality by 15% to 33%, with reported sensitivities ranging from 5% to 98% [119]. It is however important to emphasize that the presence of blood in the stool could derive from intestinal or gastric changes other than CRC, which leads to a low specificity for the FOBT. In addition, biannual testing is recommended because adenomatous polyps usually do not bleed, and bleeding from larger polyps or cancers may not always be detectable in a single stool sample [120]. However, cells of the colon are continually exfoliated and shed into the stool. Genomic DNA can be isolated from these cells and alterations in DNA methylation patterns can be analysed.

When determining the efficiency of a biomarker, the sensitivity and specificity must be taken into consideration. The sensitivity refers to the proportion of individuals with confirmed disease who test positive for the biomarker. The specificity refers to the proportion of individuals without the disease who test negative for the biomarker. These two measures can serve as the selection criteria when determining which potential biomarkers should be further investigated. An ideal biomarker assay for detection of DNA methylation would be highly sensitive and specific [20]. Molecular biomarkers represent a promising non-invasive approach in detecting cancer, determining prognosis and monitoring disease progression or therapeutic response [121].

Several studies have identified DNA methylation markers with a diagnostic potential in colorectal cancer.
VIM is a promising biomarker with a sensitivity of 73% and a specificity of 86% when analysed in fecal DNA samples. However, when combined with a DNA integrity assay (which measures long DNA stretches, more abundant in patients with a tumour) the sensitivity increased to 88% whereas the specificity was 82%. A single marker test for VIM promoter hypermethylation is today the only commercially available stool DNA test. However, other research groups have reported a lower sensitivity and specificity for methylation of VIM in fecal samples, reducing its potential to be used as a clinical biomarker. Hypermethylation of SFRP2 in stool from patients with CRC has been reported to have a sensitivity of 77% to 90%, with a specificity of 77% [122]. A recent article published by Ahuja et al presents data indicating that aberrant methylation of TFPI2 is a frequent and early event in CRC tumourigenesis. Hypermethylation of TFPI2 was detected in 97% of the adenomas and 99% of the carcinomas. Interestingly, stool methylation analysis revealed a sensitivity of 76% to 89%, and a specificity of 79% to 93% [123].

DNA methylation analysis in blood samples represents another promising non-invasive approach for early diagnosis. Studies have reported an elevated level of free DNA in the serum of cancer patients, which is most likely due to DNA released from necrotic tumour cells [20]. Aberrant, cancer-specific DNA methylation can be detected in this circulating tumour DNA, which is minimally invasive to obtain from patients. However, there are several disadvantages associated with DNA methylation biomarkers in blood. It is questioned whether there is enough methylated DNA present in the blood to efficiently detect tumours at an early stage. Moreover, blood is not organ-specific, meaning that methylated DNA in the bloodstream could point to cancer in any of several organs [124].
So far, hypermethylation of SEPT9 is the most promising blood-based CRC methylation biomarker, with a sensitivity of 72% and a specificity of 90%. Methylation of SEPT9 was also detected in some patients who had large polyps, although with a sensitivity of only 20%. This implies that the low sensitivity of SEPT9 reduces its ability to be utilised as a diagnostic tool for early detection [125]. Interestingly, recent research has focused on the elevated levels of certain miRNAs in the plasma of CRC patients. A study by Sung and co-workers identified a significantly higher level of miR-92 in such patients, with a reported sensitivity and specificity of 89% and 70%, respectively [126].

The biomarkers that have been identified in the literature are promising; however, more biomarkers need to be identified in order to increase the sensitivity and specificity. A promising biomarker panel consisting of six methylated genes with a high sensitivity and specificity for CRC and adenomas has recently been identified in our lab (Lind et al, unpublished, [102,103]). The combined panel has a sensitivity of 93% in CRC and 88% in adenomas, and a specificity of 99%, and therefore has the potential of performing well in early detection of colorectal tumours. Stool samples are currently being analysed, and if the high sensitivity and specificity measurements are validated in these non-invasive samples, this biomarker panel could be a suitable assay for a non-invasive methylation-based fecal test.

The two novel methylated gene targets presented in this thesis were not highly sensitive. However, the genes were 100% specific, which implies that they can be included in a diagnostic test together with other genes that are highly sensitive and specific, thus increasing the robustness of the test. It would also be an advantage if the cancer-specific methylation profile of the genes were not present in other cancer types or inflammatory conditions, as this may reduce the specificity of the test.
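The sensitivity and specificity figures discussed above are simple proportions, and a short calculation makes the definitions concrete. The sketch below assumes that a sample is scored as methylated when its PMR value reaches a chosen threshold; the function name, the thresholds and all PMR values are hypothetical and do not come from the thesis data.

```python
def sensitivity_specificity(tumour_pmr, normal_pmr, threshold):
    """Return (sensitivity, specificity) as fractions between 0 and 1.

    A sample is scored as methylated when its PMR value is at or above
    the threshold (an assumed scoring rule for this illustration).
    """
    if not tumour_pmr or not normal_pmr:
        raise ValueError("both sample groups must be non-empty")
    # Tumours scored as methylated are true positives.
    true_pos = sum(1 for value in tumour_pmr if value >= threshold)
    # Normal samples scored as unmethylated are true negatives.
    true_neg = sum(1 for value in normal_pmr if value < threshold)
    return true_pos / len(tumour_pmr), true_neg / len(normal_pmr)


# Hypothetical PMR values for ten tumours and five normal mucosa samples.
tumour_pmr = [0.0, 2.0, 8.0, 15.0, 40.0, 65.0, 90.0, 1.0, 30.0, 55.0]
normal_pmr = [0.0, 0.0, 0.5, 1.0, 3.0]

for threshold in (2.0, 5.0):
    sens, spec = sensitivity_specificity(tumour_pmr, normal_pmr, threshold)
    print(f"threshold {threshold}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

With these made-up values, the higher threshold scores fewer tumours as methylated but no normal samples, so specificity rises while sensitivity falls.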
The choice of threshold for scoring of methylated samples in quantitative methylation-specific analysis will ultimately affect the sensitivity and specificity of an assay. Setting a high threshold for a cancer-specific marker will increase the specificity, but reduce the sensitivity. Conversely, setting the threshold lower will increase the sensitivity, but reduce the specificity. A high specificity is preferred in diagnostic tests because a low specificity will increase the number of false positives, which not only affects the patients' quality of life but also lowers the cost-efficiency due to follow-up of these patients. The receiver operating characteristic (ROC) curve is a statistical tool which is used to evaluate the performance of a biomarker assay. ROC curves are also a useful tool to guide the choice of threshold values in order to reach the optimal sensitivity and specificity. The area under the ROC curve (AUC) is a measure of the ability of a biomarker to accurately classify tumour and non-tumour tissue (normal tissue) [20].

DNA methylation as a biomarker has several advantages over other molecular biomarkers, such as RNA and protein (see section 1.4 in the introduction). Several DNA methylation changes have been shown to occur early in carcinogenesis and are therefore good biomarkers for early cancer development. Furthermore, promoter hypermethylation of important tumour-suppressor genes is a key event in human cancers and is often associated with transcriptional silencing. Research indicates that each type of human cancer is associated with a distinct methylation profile, which can be used to identify the tissue of origin of a particular neoplasm. In addition, DNA methylation is easier to handle methodologically than RNA followed by reverse-transcription PCR, mainly because DNA is more stable than RNA and protein [95].
An advantage of DNA methylation over protein-based markers is that DNA is readily amplifiable and easily detectable, whereas protein cannot be amplified. Furthermore, cancer-specific mutations can occur anywhere in a gene, while DNA methylation usually occurs in defined regions (the promoter region) of a gene.

6. Conclusions

Using a combined approach of microarray analysis and in vitro, in vivo and in silico analysis, we have successfully identified GLDC and PPP1R14A as novel epigenetically deregulated genes in colorectal cancer. Both genes were unmethylated in normal mucosa samples, and frequently methylated in colorectal tumours, resulting in a sensitivity of 60% and 57%, respectively, and a specificity of 100%.

7. Future perspectives

Time of methylation onset during tumor development for the two novel genes

The two novel methylated target genes in CRC identified in the present thesis will additionally be analysed in colorectal adenomas in order to see whether they are early changes in the tumourigenesis or markers for fully developed carcinomas. Diagnosis of CRC at an early stage can dramatically improve patient survival; therefore, our main aim in previous studies has been to identify early changes. However, markers classifying whether tumours are malignant or benign would also be highly interesting in a clinical setting.

Are the two novel genes methylated in various cancer types?

To investigate whether the candidate genes are specific for colorectal cancer, or if they are epigenetically deregulated across several cancer diseases in the gastrointestinal tract, a series of cancer cell lines derived from different tissues will be analysed and the methylation profiles will be compared. The methylation frequency should preferably be specific for colorectal cancer; however, the discovery of epigenetic master keys for the gastrointestinal tract will certainly be of interest.

The functional significance of the two novel genes?
We will apply reverse transcription-PCR, which measures the level of mRNA transcribed from a gene, to confirm whether hypermethylation is associated with reduced or lost gene expression, and whether treatment with AZA leads to re-expression. We will also consider performing functional studies to explore whether loss of protein expression has any effect on cell growth. This is important for determining whether methylation of the genes has a direct role in driving tumourigenesis, or whether the methylated genes are mere passengers in the process.

Additional new epigenetic markers in CRC

In the present thesis, candidate genes were selected on the basis of a microarray approach in which colon cancer cell lines were cultured with both a demethylating agent (AZA) and a histone deacetylase inhibitor (TSA). This approach has previously been utilised in our lab and has resulted in the discovery of promising biomarkers with high sensitivity and specificity. In the present study, the methylation frequencies were not as high as expected, possibly due to the low-concentration treatment of the cell lines (see discussion). However, the criteria set to select candidate genes obviously influence the resulting gene list, and we may identify target genes with higher methylation frequencies by changing these criteria. Additionally, colon cancer cell lines have been cultured with either AZA or TSA alone, and it would be of interest to compare the gene lists from the combined and individual treatment strategies and to analyse potential novel candidate genes generated from the individual treatments. The pipeline will include the same technical validation strategy as introduced in this thesis, including qualitative and quantitative MSP analyses, as well as bisulfite sequencing, of cell lines and clinical samples.
If the biomarkers have high methylation frequencies in adenomas and carcinomas, while at the same time being unmethylated in normal samples, fecal samples may be analysed to determine the sensitivity and specificity in non-invasive material.

Building an optimal epigenetic biomarker panel for non-invasive testing

Using only markers identified in our own lab opens the possibility of participating in the development of a marker set for screening purposes and for monitoring of CRC patients. The biomarker panel identified in our lab may possibly be improved by including GLDC and/or PPP1R14A.

8. Reference list

1. Nowell PC: The clonal evolution of tumor cell populations. Science 1976, 194: 23-28.
2. Heim S, Teixeira MR, Dietrich CU, Pandis N: Cytogenetic polyclonality in tumors of the breast. Cancer Genet Cytogenet 1997, 95: 16-19.
3. Al-Hajj M, Wicha MS, Benito-Hernandez A, Morrison SJ, Clarke MF: Prospective identification of tumorigenic breast cancer cells. Proc Natl Acad Sci U S A 2003, 100: 3983-3988.
4. Wicha MS, Liu S, Dontu G: Cancer stem cells: an old idea-a paradigm shift. Cancer Res 2006, 66: 1883-1890.
5. Vogelstein B, Kinzler KW: Cancer genes and the pathways they control. Nat Med 2004, 10: 789-799.
6. Knudson AG, Jr.: Mutation and cancer: statistical study of retinoblastoma. Proc Natl Acad Sci U S A 1971, 68: 820-823.
7. Teodoridis JM, Strathdee G, Brown R: Epigenetic silencing mediated by CpG island methylation: potential as a therapeutic target and as a biomarker. Drug Resist Updat 2004, 7: 267-278.
8. Waddington C: The Epigenotype. Endeavour 1942, 1: 18-20.
9. Vargas AO: Did Paul Kammerer discover epigenetic inheritance? A modern look at the controversial midwife toad experiments. J Exp Zool B Mol Dev Evol 2009, 312: 667-678.
10. Pennisi E: History of science. The case of the midwife toad: fraud or epigenetics? Science 2009, 325: 1194-1195.
11. Herceg Z, Hainaut P: Genetic and epigenetic alterations as biomarkers for cancer detection, diagnosis and prognosis. Mol Oncol 2007, 1: 26-41.
12. Feinberg AP, Vogelstein B: Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature 1983, 301: 89-92.
13. Esteller M: Epigenetics in cancer. N Engl J Med 2008, 358: 1148-1159.
14. Sved J, Bird A: The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model. Proc Natl Acad Sci U S A 1990, 87: 4692-4696.
15. Gardiner-Garden M, Frommer M: CpG islands in vertebrate genomes. J Mol Biol 1987, 196: 261-282.
16. Herman JG, Baylin SB: Gene silencing in cancer in association with promoter hypermethylation. N Engl J Med 2003, 349: 2042-2054.
17. Takai D, Jones PA: Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proc Natl Acad Sci U S A 2002, 99: 3740-3745.
18. Li E: Chromatin modification and epigenetic reprogramming in mammalian development. Nat Rev Genet 2002, 3: 662-673.
19. Okano M, Bell DW, Haber DA, Li E: DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 1999, 99: 247-257.
20. Laird PW: The power and the promise of DNA methylation markers. Nat Rev Cancer 2003, 3: 253-266.
21. Feinberg AP, Tycko B: The history of cancer epigenetics. Nat Rev Cancer 2004, 4: 143-153.
22. Karpf AR, Matsui S: Genetic disruption of cytosine DNA methyltransferase enzymes induces chromosomal instability in human cancer cells. Cancer Res 2005, 65: 8635-8639.
23. Suter CM, Martin DI, Ward RL: Hypomethylation of L1 retrotransposons in colorectal cancer and adjacent normal tissue. Int J Colorectal Dis 2004, 19: 95-101.
24. Bariol C, Suter C, Cheong K, Ku SL, Meagher A, Hawkins N et al.: The relationship between hypomethylation and CpG island methylation in colorectal neoplasia. Am J Pathol 2003, 162: 1361-1371.
25. Feinberg AP, Vogelstein B: Hypomethylation of ras oncogenes in primary human cancers. Biochem Biophys Res Commun 1983, 111: 47-54.
26. Greger V, Debus N, Lohmann D, Hopping W, Passarge E, Horsthemke B: Frequency and parental origin of hypermethylated RB1 alleles in retinoblastoma. Hum Genet 1994, 94: 491-496.
27. Esteller M, Corn PG, Baylin SB, Herman JG: A gene hypermethylation profile of human cancer. Cancer Res 2001, 61: 3225-3229.
28. Reik W, Walter J: Genomic imprinting: parental influence on the genome. Nat Rev Genet 2001, 2: 21-32.
29. Li E, Beard C, Jaenisch R: Role for DNA methylation in genomic imprinting. Nature 1993, 366: 362-365.
30. Paulsen M, El-Maarri O, Engemann S, Strodicke M, Franck O, Davies K et al.: Sequence conservation and variability of imprinting in the Beckwith-Wiedemann syndrome gene cluster in human and mouse. Hum Mol Genet 2000, 9: 1829-1841.
31. Cui H, Cruz-Correa M, Giardiello FM, Hutcheon DF, Kafonek DR, Brandenburg S et al.: Loss of IGF2 imprinting: a potential marker of colorectal cancer risk. Science 2003, 299: 1753-1755.
32. Thorvaldsen JL, Duran KL, Bartolomei MS: Deletion of the H19 differentially methylated domain results in loss of imprinted expression of H19 and Igf2. Genes Dev 1998, 12: 3693-3702.
33. van den Berg IM, Laven JS, Stevens M, Jonkers I, Galjaard RJ, Gribnau J et al.: X chromosome inactivation is initiated in human preimplantation embryos. Am J Hum Genet 2009, 84: 771-779.
34. Csankovszki G, Nagy A, Jaenisch R: Synergism of Xist RNA, DNA methylation, and histone hypoacetylation in maintaining X chromosome inactivation. J Cell Biol 2001, 153: 773-784.
35. Mohandas T, Sparkes RS, Shapiro LJ: Reactivation of an inactive human X chromosome: evidence for X inactivation by DNA methylation. Science 1981, 211: 393-396.
36. Xu GL, Bestor TH, Bourc'his D, Hsieh CL, Tommerup N, Bugge M et al.: Chromosome instability and immunodeficiency syndrome caused by mutations in a DNA methyltransferase gene. Nature 1999, 402: 187-191.
37. Hansen RS, Stoger R, Wijmenga C, Stanek AM, Canfield TK, Luo P et al.: Escape from gene silencing in ICF syndrome: evidence for advanced replication time as a major determinant. Hum Mol Genet 2000, 9: 2575-2587.
38. Weisenberger DJ, Campan M, Long TI, Kim M, Woods C, Fiala E et al.: Analysis of repetitive element DNA methylation by MethyLight. Nucleic Acids Res 2005, 33: 6823-6836.
39. Ehrlich M: DNA methylation in cancer: too much, but also too little. Oncogene 2002, 21: 5400-5413.
40. Ting AH, McGarvey KM, Baylin SB: The cancer epigenome-components and functional correlates. Genes Dev 2006, 20: 3215-3231.
41. Jenuwein T, Allis CD: Translating the histone code. Science 2001, 293: 1074-1080.
42. Narlikar GJ, Fan HY, Kingston RE: Cooperation between complexes that regulate chromatin structure and transcription. Cell 2002, 108: 475-487.
43. Harikrishnan KN, Chow MZ, Baker EK, Pal S, Bassal S, Brasacchio D et al.: Brahma links the SWI/SNF chromatin-remodeling complex with MeCP2-dependent transcriptional silencing. Nat Genet 2005, 37: 254-264.
44. Wade PA: SWItching off methylated DNA. Nat Genet 2005, 37: 212-213.
45. Lujambio A, Esteller M: CpG island hypermethylation of tumor suppressor microRNAs in human cancer. Cell Cycle 2007, 6: 1455-1459.
46. Kent OA, Mendell JT: A small piece in the cancer puzzle: microRNAs as tumor suppressors and oncogenes. Oncogene 2006, 25: 6188-6196.
47. Yang N, Coukos G, Zhang L: MicroRNA epigenetic alterations in human cancer: one step forward in diagnosis and treatment. Int J Cancer 2008, 122: 963-968.
48. Toyota M, Suzuki H, Sasaki Y, Maruyama R, Imai K, Shinomura Y et al.: Epigenetic silencing of microRNA-34b/c and B-cell translocation gene 4 is associated with CpG island methylation in colorectal cancer. Cancer Res 2008, 68: 4123-4132.
49. Ng EK, Tsang WP, Ng SS, Jin HC, Yu J, Li JJ et al.: MicroRNA-143 targets DNA methyltransferases 3A in colorectal cancer. Br J Cancer 2009, 101: 699-706.
50. Slaby O, Svoboda M, Fabian P, Smerdova T, Knoflickova D, Bednarikova M et al.: Altered expression of miR-21, miR-31, miR-143 and miR-145 is related to clinicopathologic features of colorectal cancer. Oncology 2007, 72: 397-402.
51. Herceg Z: Epigenetics and cancer: towards an evaluation of the impact of environmental and dietary factors. Mutagenesis 2007, 22: 91-103.
52. Jirtle RL, Skinner MK: Environmental epigenomics and disease susceptibility. Nat Rev Genet 2007, 8: 253-262.
53. Mathers JC: Folate intake and bowel cancer risk. Genes Nutr 2009, 4: 173-178.
54. Steegers-Theunissen RP, Obermann-Borst SA, Kremer D, Lindemans J, Siebel C, Steegers EA et al.: Periconceptional maternal folic acid use of 400 microg per day is related to increased methylation of the IGF2 gene in the very young child. PLoS One 2009, 4: e7845.
55. Christensen BC, Houseman EA, Marsit CJ, Zheng S, Wrensch MR, Wiemels JL et al.: Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context. PLoS Genet 2009, 5: e1000602.
56. Kaminsky ZA, Tang T, Wang SC, Ptak C, Oh GH, Wong AH et al.: DNA methylation profiles in monozygotic and dizygotic twins. Nat Genet 2009, 41: 240-245.
57. Fraga MF, Ballestar E, Paz MF, Ropero S, Setien F, Ballestar ML et al.: Epigenetic differences arise during the lifetime of monozygotic twins. Proc Natl Acad Sci U S A 2005, 102: 10604-10609.
58. American Cancer Society: Global Cancer- Facts and Figures. 2008.
59. Cancer Registry of Norway: Cancer in Norway-Cancer Incidence, Mortality, Survival and Prevalence in Norway. 2008.
60. Harriss DJ, Atkinson G, George K, Cable NT, Reilly T, Haboubi N et al.: Lifestyle factors and colorectal cancer risk (1): systematic review and meta-analysis of associations with body mass index. Colorectal Dis 2009, 11: 547-563.
61. Al-Sukhni W, Aronson M, Gallinger S: Hereditary colorectal cancer syndromes: familial adenomatous polyposis and Lynch syndrome. Surg Clin North Am 2008, 88: 819-844, vii.
62. Umar A, Risinger JI, Hawk ET, Barrett JC: Testing guidelines for hereditary nonpolyposis colorectal cancer. Nat Rev Cancer 2004, 4: 153-158.
63. Muleris M, Chalastanis A, Meyer N, Lae M, Dutrillaux B, Sastre-Garau X et al.: Chromosomal instability in near-diploid colorectal cancer: a link between numbers and structure. PLoS One 2008, 3: e1632.
64. Grady WM, Carethers JM: Genomic and epigenetic instability in colorectal cancer pathogenesis. Gastroenterology 2008, 135: 1079-1099.
65. de la Chapelle A: Microsatellite Instability. N Engl J Med 2003.
66. Wong JJ, Hawkins NJ, Ward RL: Colorectal cancer: a model for epigenetic tumorigenesis. Gut 2007, 56: 140-148.
67. Toyota M, Ahuja N, Ohe-Toyota M, Herman JG, Baylin SB, Issa JP: CpG island methylator phenotype in colorectal cancer. Proc Natl Acad Sci U S A 1999, 96: 8681-8686.
68. Weisenberger DJ, Siegmund KD, Campan M, Young J, Long TI, Faasse MA et al.: CpG island methylator phenotype underlies sporadic microsatellite instability and is tightly associated with BRAF mutation in colorectal cancer. Nat Genet 2006, 38: 787-793.
69. Shen L, Toyota M, Kondo Y, Lin E, Zhang L, Guo Y et al.: Integrated genetic and epigenetic analysis identifies three different subclasses of colon cancer. Proc Natl Acad Sci U S A 2007, 104: 18654-18659.
70. Ponz de Leon M, Di Gregorio C: Pathology of colorectal cancer. Dig Liver Dis 2001, 33: 372-388.
71. Kinzler KW, Vogelstein B: Lessons from hereditary colorectal cancer. Cell 1996, 87: 159-170.
72. Carr NJ, Mahajan H, Tan KL, Hawkins NJ, Ward RL: Serrated and non-serrated polyps of the colorectum: their prevalence in an unselected case series and correlation of BRAF mutation analysis with the diagnosis of sessile serrated adenoma. J Clin Pathol 2009, 62: 516-518.
73. Snover DC, Jass JR, Fenoglio-Preiser C, Batts KP: Serrated polyps of the large intestine: a morphologic and molecular review of an evolving concept. Am J Clin Pathol 2005, 124: 380-391.
74. Jass JR: Colorectal cancer: a multipathway disease. Crit Rev Oncog 2006, 12: 273-287.
75. Dukes CE: The classification of cancer of the rectum. The Journal of Pathology and Bacteriology 1932, 35: 323-333.
76. Lind GE, Kleivi K, Meling GI, Teixeira MR, Thiis-Evensen E, Rognum TO et al.: ADAMTS1, CRABP1, and NR3C1 identified as epigenetically deregulated genes in colorectal tumorigenesis. Cell Oncol 2006, 28: 259-272.
77. Esteller M, Garcia-Foncillas J, Andion E, Goodman SN, Hidalgo OF, Vanaclocha V et al.: Inactivation of the DNA-repair gene MGMT and the clinical response of gliomas to alkylating agents. N Engl J Med 2000, 343: 1350-1354.
78. Hegi ME, Liu L, Herman JG, Stupp R, Wick W, Weller M et al.: Correlation of O6-methylguanine methyltransferase (MGMT) promoter methylation with clinical outcomes in glioblastoma and clinical strategies to modulate MGMT activity. J Clin Oncol 2008, 26: 4189-4199.
79. Ramaswamy S, Ross KN, Lander ES, Golub TR: A molecular signature of metastasis in primary solid tumors. Nat Genet 2003, 33: 49-54.
80. Chang JC, Wooten EC, Tsimelzon A, Hilsenbeck SG, Gutierrez MC, Elledge R et al.: Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet 2003, 362: 362-369.
81. Sorlie T, Wang Y, Xiao C, Johnsen H, Naume B, Samaha RR et al.: Distinct molecular mechanisms underlying clinically relevant subtypes of breast cancer: gene expression analyses across three different platforms. BMC Genomics 2006, 7: 127.
82. Brinkman BM: Splice variants as cancer biomarkers. Clin Biochem 2004, 37: 584-594.
83. Venables JP: Unbalanced alternative splicing and its significance in cancer. Bioessays 2006, 28: 378-386.
84. Rajan P, Elliott DJ, Robson CN, Leung HY: Alternative splicing and biological heterogeneity in prostate cancer. Nat Rev Urol 2009, 6: 454-460.
85. Varambally S, Laxman B, Mehra R, Cao Q, Dhanasekaran SM, Tomlins SA et al.: Golgi protein GOLM1 is a tissue and urine biomarker of prostate cancer. Neoplasia 2008, 10: 1285-1294.
86. Rifai N, Gillette MA, Carr SA: Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat Biotechnol 2006, 24: 971-983.
87. Meling GI, Lothe RA, Borresen AL, Hauge S, Graue C, Clausen OP et al.: Genetic alterations within the retinoblastoma locus in colorectal carcinomas. Relation to DNA ploidy pattern studied by flow cytometric analysis. Br J Cancer 1991, 64: 475-480.
88. Kunkel LM, Smith KD, Boyer SH, Borgaonkar DS, Wachtel SS, Miller OJ et al.: Analysis of human Y-chromosome-specific reiterated DNA in chromosome variants. Proc Natl Acad Sci U S A 1977, 74: 1245-1249.
89. Cameron EE, Bachman KE, Myohanen S, Herman JG, Baylin SB: Synergy of demethylation and histone deacetylase inhibition in the re-expression of genes silenced in cancer. Nat Genet 1999, 21: 103-107.
90. Christman JK: 5-Azacytidine and 5-aza-2'-deoxycytidine as inhibitors of DNA methylation: mechanistic studies and their implications for cancer therapy. Oncogene 2002, 21: 5483-5495.
91. Santi DV, Garrett CE, Barr PJ: On the mechanism of inhibition of DNA-cytosine methyltransferases by cytosine analogs. Cell 1983, 33: 9-10.
92. Shapiro R, Weisgras JM: Bisulfite-catalyzed transamination of cytosine and cytidine. Biochem Biophys Res Commun 1970, 40: 839-843.
93. Frommer M, McDonald LE, Millar DS, Collis CM, Watt F, Grigg GW et al.: A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc Natl Acad Sci U S A 1992, 89: 1827-1831.
94. Clark SJ, Harrison J, Paul CL, Frommer M: High sensitivity mapping of methylated cytosines. Nucleic Acids Res 1994, 22: 2990-2997.
95. Esteller M: DNA methylation. Approaches, Methods, and Applications. 2005.
96. Herman JG, Graff JR, Myohanen S, Nelkin BD, Baylin SB: Methylation-specific PCR: a novel PCR assay for methylation status of CpG islands. Proc Natl Acad Sci U S A 1996, 93: 9821-9826.
97. Derks S, Lentjes MH, Hellebrekers DM, de Bruine AP, Herman JG, van Engeland M: Methylation-specific PCR unraveled. Cell Oncol 2004, 26: 291-299.
98. Heid CA, Stevens J, Livak KJ, Williams PM: Real time quantitative PCR. Genome Res 1996, 6: 986-994.
99. Kubista M, Andrade JM, Bengtsson M, Forootan A, Jonak J, Lind K et al.: The real-time polymerase chain reaction. Mol Aspects Med 2006, 27: 95-125.
100. Atkinson MR, Deutscher MP, Kornberg A, Russell AF, Moffatt JG: Enzymatic synthesis of deoxyribonucleic acid. XXXIV. Termination of chain growth by a 2',3'-dideoxyribonucleotide. Biochemistry 1969, 8: 4897-4904.
101. Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A 1977, 74: 5463-5467.
102. Lind GE, Ahlquist T, Lothe RA: DNA hypermethylation of MAL: a promising diagnostic biomarker for colorectal tumors. Gastroenterology 2007, 132: 1631-1632.
103. Lind GE, Ahlquist T, Kolberg M, Berg M, Eknaes M, Alonso MA et al.: Hypermethylated MAL gene - a silent marker of early colon tumorigenesis. J Transl Med 2008, 6: 13.
104. Mori Y, Cai K, Cheng Y, Wang S, Paun B, Hamilton JP et al.: A genome-wide search identifies epigenetic silencing of somatostatin, tachykinin-1, and 5 other genes in colon cancer. Gastroenterology 2006, 131: 797-808.
105. Paz MF, Fraga MF, Avila S, Guo M, Pollan M, Herman JG et al.: A systematic profile of DNA methylation in human cancer cell lines. Cancer Res 2003, 63: 1114-1121.
106. Li L: Establishment of tumour cell lines by transient expression of immortalizing genes. Gene Therapy & Molecular Biology 1999, 4: 261-274.
107. Ueki T, Walter KM, Skinner H, Jaffee E, Hruban RH, Goggins M: Aberrant CpG island methylation in cancer cell lines arises in the primary cancers from which they were derived. Oncogene 2002, 21: 2114-2117.
108. Smiraglia DJ, Rush LJ, Fruhwald MC, Dai Z, Held WA, Costello JF et al.: Excessive CpG island hypermethylation in cancer cell lines versus primary human malignancies. Hum Mol Genet 2001, 10: 1413-1419.
109. Murai M, Toyota M, Suzuki H, Satoh A, Sasaki Y, Akino K et al.: Aberrant methylation and silencing of the BNIP3 gene in colorectal and gastric cancer. Clin Cancer Res 2005, 11: 1021-1027.
110. Lee H, Paik SG: Regulation of BNIP3 in normal and cancer cells. Mol Cells 2006, 21: 1-6.
111. Kainz B, Shehata M, Bilban M, Kienle D, Heintel D, Kromer-Holzinger E et al.: Overexpression of the paternally expressed gene 10 (PEG10) from the imprinted locus on chromosome 7q21 in high-risk B-cell chronic lymphocytic leukemia. Int J Cancer 2007, 121: 1984-1993.
112. Li CM, Margolin AA, Salas M, Memeo L, Mansukhani M, Hibshoosh H et al.: PEG10 is a cMYC target gene in cancer cells. Cancer Res 2006, 66: 665-672.
113. Roman-Gomez J, Jimenez-Velasco A, Agirre X, Castillejo JA, Navarro G, San Jose-Eneriz E et al.: Epigenetic regulation of human cancer/testis antigen gene, HAGE, in chronic myeloid leukemia. Haematologica 2007, 92: 153-162.
114. Kanno J, Hutchin T, Kamada F, Narisawa A, Aoki Y, Matsubara Y et al.: Genomic deletion within GLDC is a major cause of non-ketotic hyperglycinaemia. J Med Genet 2007, 44: e69.
115. Amin K, Li J, Chao WR, Dewhirst MW, Haroon ZA: Dietary glycine inhibits angiogenesis during wound healing and tumor growth. Cancer Biol Ther 2003, 2: 173-178.
116. Chan AO, Broaddus RR, Houlihan PS, Issa JP, Hamilton SR, Rashid A: CpG island methylation in aberrant crypt foci of the colorectum. Am J Pathol 2002, 160: 1823-1830.
117. Robertson KD, Jones PA: DNA methylation: past, present and future directions. Carcinogenesis 2000, 21: 461-467.
118. Sawan C, Vaissiere T, Murr R, Herceg Z: Epigenetic drivers and genetic passengers on the road to cancer. Mutat Res 2008, 642: 1-13.
119. Burch JA: Diagnostic accuracy of faecal occult blood tests used in screening for colorectal cancer: a systematic review. Journal of Medical Screening 2007, 14: 132-137.
120. Levin B, Lieberman DA, McFarland B, Andrews KS, Brooks D, Bond J et al.: Screening and surveillance for the early detection of colorectal cancer and adenomatous polyps, 2008: a joint guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. Gastroenterology 2008, 134: 1570-1595.
121. Sidransky D: Emerging molecular markers of cancer. Nat Rev Cancer 2002, 2: 210-219.
122. Muller HM, Oberwalder M, Fiegl H, Morandell M, Goebel G, Zitt M et al.: Methylation changes in faecal DNA: a marker for colorectal cancer screening? Lancet 2004, 363: 1283-1285.
123. Glockner SC, Dhir M, Yi JM, McGarvey KE, Van Neste L, Louwagie J et al.: Methylation of TFPI2 in stool DNA: a potential novel biomarker for the detection of colorectal cancer. Cancer Res 2009, 69: 4691-4699.
124. Anglim PP, Alonzo TA, Laird-Offringa IA: DNA methylation-based biomarkers for early detection of non-small cell lung cancer: an update. Mol Cancer 2008, 7: 81.
125. Grutzmann R, Molnar B, Pilarsky C, Habermann JK, Schlag PM, Saeger HD et al.: Sensitive detection of colorectal cancer in peripheral blood by septin 9 DNA methylation assay. PLoS One 2008, 3: e3759.
126. Ng EK, Chong WW, Jin H, Lam EK, Shin VY, Yu J et al.: Differential expression of microRNAs in plasma of patients with colorectal cancer: a potential marker for colorectal cancer screening. Gut 2009, 58: 1375-1381.
Appendix I – Tumour samples

COLORECTAL CARCINOMAS

Sample | MSI status | Gender | Age | Localisation | Dukes' stage | Differentiation | BRAF | KRAS | PIK3CA | PTEN | TP53
GIM_c884 | MSI | Female | 90 | Right | B | Medium | Mutated | Wild-type | Wild-type | Mutated | Wild-type
GIM_c887 | MSS | Female | 82 | Rectum | B | High | Wild-type | Mutated | Mutated | Wild-type | Wild-type
GIM_c894I | MSI | Male | 80 | Right | B | Medium | Wild-type | Mutated | Wild-type | Wild-type | Wild-type
GIM_c896 | MSS | Female | 71 | Rectum | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
GIM_c910 | MSI | Female | 65 | Right | B | High | Wild-type | Wild-type | Wild-type | Wild-type | Wild-type
GIM_c912I | MSI | Female | 66 | Left | B | Low | Wild-type | Mutated | Mutated | Wild-type | Wild-type
GIM_c946 | MSS | Male | 77 | Left | B | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Wild-type
GIM_c955 | MSI | Female | 84 | Right | B | Medium | Mutated | Wild-type | Wild-type | Wild-type | Wild-type
GIM_c1013 | MSS | Female | 66 | Rectum | B | Medium | Mutated | Wild-type | Wild-type | Wild-type | Mutated
GIM_c1044II | MSI | Female | 63 | Rectum | A | Medium | Mutated | Wild-type | Wild-type | Mutated | Wild-type
GIM_c1045 | MSS | Female | 62 | Left | A | Medium | Wild-type | - | - | - | Mutated
GIM_c1047III | MSI | Male | 70 | Rectum | C | Medium | Wild-type | Mutated | Wild-type | Wild-type | Wild-type
GIM_c1117I | MSI | Male | 78 | Right | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Wild-type
GIM_c1121 | MSS | Male | 71 | Left | B | Medium | Wild-type | Mutated | Mutated | Wild-type | Mutated
GIM_c1141II | MSI | Female | 76 | Right | D | Medium | Wild-type | Wild-type | Wild-type | Mutated | Wild-type
GIM_c1166 | MSS | Male | 77 | Left | B | Medium | Wild-type | Mutated | Wild-type | Wild-type | Wild-type
GIM_c1167 | MSS | Male | 73 | Left | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
GIM_c1193 | MSI | Female | 69 | Right | C | Low | Mutated | Wild-type | Mutated | Wild-type | Wild-type
GIM_c1194 | MSS | Male | 44 | Left | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
GIM_c1268III | MSI | Male | 71 | Right | B | Low | Mutated | Wild-type | Mutated | Mutated | Wild-type
GIM_c1341I | MSI | Female | 89 | Right | B | Medium | Mutated | Wild-type | Wild-type | Mutated | Wild-type
GIM_c1342 | MSI | Male | 49 | Right | B | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
GIM_c1363 | MSI | Male | 70 | Right | A | Medium | Wild-type | Mutated | Wild-type | Wild-type | Wild-type
GIM_c1369 | MSS | Female | 82 | Left | B | Low | Wild-type | Mutated | - | - | Wild-type
AUS_001 | MSS | Female | 71 | Left | A | Medium | Wild-type | Mutated | Wild-type | Wild-type | Mutated
AUS_003 | MSS | Female | 79 | Right | B | Medium | Wild-type | Mutated | Wild-type | Wild-type | Wild-type
AUS_006 | MSS | Male | 62 | Right | A | Medium | Wild-type | Mutated | Wild-type | Wild-type | Mutated
AUS_007 | MSS | Female | 87 | Rectum | A | Medium | Wild-type | Mutated | Wild-type | Wild-type | Mutated
AUS_008 | MSS | Female | 39 | Left | A | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_011 | MSS | Female | 67 | Left | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_012 | MSI | Female | 84 | Right | B | Low | Mutated | Wild-type | Wild-type | Mutated | Wild-type
AUS_015 | MSI | Female | 66 | Right | C | Low | Mutated | Wild-type | Wild-type | Wild-type | Wild-type
AUS_017 | MSS | Female | 73 | Rectum | B | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_018 | MSS | Male | 78 | Left | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_019 | MSI | Male | 71 | Right | C | Low | Wild-type | Wild-type | Wild-type | Wild-type | Wild-type
AUS_020 | MSS | Male | 42 | Right | D | Medium | Mutated | Wild-type | Wild-type | Wild-type | Mutated
AUS_021 | MSS | Male | 77 | Left | A | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_024 | MSS | Male | 77 | Right | C | Medium | Wild-type | Mutated | Wild-type | Wild-type | Mutated
AUS_025 | MSS | Male | 71 | Left | A | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_026 | MSS | Female | 62 | Transversum | B | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_030 | MSI | Female | 84 | Right | B | High | Mutated | Wild-type | Wild-type | Wild-type | Wild-type
AUS_032 | MSI | Male | 69 | - | A | High | Mutated | Wild-type | Wild-type | Wild-type | Wild-type
AUS_035 | MSS | Female | 81 | Left | A | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_036 | MSS | Male | 35 | Rectum | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Mutated
AUS_037 | MSI | Male | 69 | Transversum | B | Medium | Wild-type | Wild-type | Mutated | Wild-type | Wild-type
AUS_040 | MSS | Male | 69 | Left | D | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Wild-type
AUS_111 | MSS | Male | 64 | Rectum | C | Medium | Wild-type | Wild-type | Wild-type | Wild-type | Wild-type

Appendix II – Normal tissue samples

Normal mucosa

Sample | Gender | Age | Localisation
N1 | Male | 44 | Distal
N2 | Male | 54 | Proximal
N3 | Male | 33 | Distal
N9 | Female | 63 | Proximal
N11 | Female | 40 | Rectum
N12 | Male | 40 | Coecum
N13 | Male | 38 | Rectum
N14 | Male | 38 | Coecum
N15 | Female | 82 | Rectum
N16 | Male | 54 | Proximal
N17 | Female | 60 | Distal
N18 | Female | 75 | Coecum
N19 | Female | 39 | Rectum
N20 | Female | 48 | Coecum
N21 | Male | 54 | Rectum
N22 | Female | 86 | Coecum
N24 | Male | 40 | Rectum
N26 | Male | 74 | Rectum
N27 | Male | 51 | Rectum
N28 | Female | 79 | Rectum
N29 | Male | 53 | Proximal
N33 | Female | 76 | Distal
N36 | Male | 85 | Distal
N37 | Male | 38 | Proximal
N38 | Female | 22 | Distal
N42 | Female | 46 | Proximal
N47 | Female | 62 | Proximal
N49 | Female | 73 | Proximal
N50 | Male | 61 | Coecum
N51 | Male | 43 | Coecum
N52 | Male | 62 | Coecum
N53 | Female | 69 | Coecum
N55 | Female | 78 | Coecum
N57 | Male | 81 | Coecum
N58 | Male | 38 | Rectum
N59 | Female | 79 | Rectum
N61 | Male | 55 | Rectum
N62 | Male | 68 | Rectum
N63 | Male | 54 | Rectum
N64 | Male | 28 | Rectum
N65 | Male | 62 | Rectum
N66 | Male | 60 | Coecum
N67 | Female | 55 | Coecum
N68 | Male | 72 | Coecum
N69 | Male | 64 | Coecum
N70 | Male | 85 | Rectum
N71 | Male | 57 | Rectum
N72 | Male | 66 | Coecum
N73 | Male | 42 | Rectum

Appendix III – Qualitative MSP analyses

Primer information

Primer set | Forward primer | Reverse primer | Fragment size (base pairs) | Annealing temperature (°C) | Annealing and elongation time (sec) | Accession number
BNIP3-MSP-M | ACGCGTCGTACGTGTTATAC | ACTACGCTCCCGAACTAAAC | 158 | 52 | 30 | NM_004052
BNIP3-MSP-U | GTATGTGTTGTATGTGTTATAT | ACTACACTCCCAAACTAAACAA | 158 | 52 | 30 | NM_004052
CBS-MSP-M | GTTACGAGATATTGGTCGGC | CTACGACGAAACGAAAACG | 127 | 48 | 60 | NM_000071
CBS-MSP-U | TTTGTTATGAGATATTGGTTGGT | ACTACAACAAAACAAAAACAAC | 127 | 48 | 60 | NM_000071
DDX43-MSP-M | GGCGTTTGGAAAAAGTTTTAC | CCAATCGATTTTCTAAACCG | 110 | 50 | 60 | NM_018665
DDX43-MSP-U | TTGGGTGTTTGGAAAAAGTTTTAT | TAACCAATCAATTTTCTAAACCA | 110 | 50 | 60 | NM_018665
GLDC-MSP-M | CGTCGTTTAAAGTGTGC | CAATCGACCGAACAAATAAA | 122 | 48 | 60 | NM_000170
GLDC-MSP-U | AGGGTGTTGTTTAAAGTGTGT | AAACAATCAACCAAACAAATAAA | 122 | 48 | 60 | NM_000170
GLDC BS | GGGTAGGATTGGAGATGGTAGT | CTCTTAACCCCTCTCCTAACCTC | 364 | 56 | 30 | NM_000170
IQCG-MSP-M | GGTAGACGGAGGGTTTAGTC | CATTTATTAACCGACTTCGC | 133 | 48 | 30 | NM_032263
IQCG-MSP-U | GGGGTAGATGGAGGGTTTAGTT | AACATTTATTAACCAACTTCAC | 133 | 48 | 30 | NM_032263
PEG10-MSP-M | GAGTACGTTGGGATTTGGC | ACTCGATAAACCTTCTCCGC | 152 | 52 | 30 | NM_001040152
PEG10-MSP-U | GGAGTATGTTGGGATTTGGT | AACTCAATAAACCTTCTCCAC | 152 | 52 | 30 | NM_001040152
PPP1R14A-MSP-M | TTAGAGGGCGTAGATAGGTC | CTACGTCGACTTAAAACACG | 157 | 52 | 30 | NM_033256
PPP1R14A-MSP-U | GTTAGAGGGTGTAGATAGGTT | CTACATCAACTTAAAACACAC | 157 | 52 | 30 | NM_033256
PPP1R14A BS | TTAGTTTGGGYGATAAAGAGAG | CCTCAAACCTCAATTTCCC | 382 | 56 | 30 | NM_033256
RASSF4-MSP-M | TTATCGGCGTTTTTAGAGC | CCGACACGACCAAAAATA | 113 | 56 | 60 | NM_032023
RASSF4-MSP-U | AATTTATTGGTGTTTTTAGAGT | CCCAACACAACCAAAAATACC | 113 | 56 | 60 | NM_032023
RBP7-MSP-M | TTTGGTTTATAGGTTTCGGTTC | AACCCTCGAAATTATCGCTA | 123 | 54 | 45 | NM_052960
RBP7-MSP-U | GTTTGGTTTATAGGTTTTGGTTT | TAACCCTCAAAATTATCACTA | 123 | 54 | 45 | NM_052960
WDR21B-MSP-M | ATTTTCGTTTGTATTCGGAC | TCCTACGAAATATTCCTCGT | 105 | 48 | 60 | NM_001029955
WDR21B-MSP-U | AATATTTTTGTTTGTATTTGGAT | TCCTCCTACAAAATATTCCTCAT | 105 | 48 | 60 | NM_001029955

Abbreviations: BS, bisulfite sequence; MSP, methylation-specific PCR; U, unmethylated; M, methylated.

Appendix IV – Quantitative MSP analyses

Primer and probe information

Assay | Forward primer | Reverse primer | Probe | Fragment size (base pairs)
GLDC qMSP | GGGCGTCGTTTAAAGTGTGC | GCGAACAATAAATAAACGCTACGC | 6FAM-GGTGGAGTTATAATTTTGCGCGA-MGB | 98
PPP1R14A qMSP | ACGAAGGAATAAGTGATCGTTCG | CGCCCTCTAACGATAACGAAA | 6FAM-GATAGCGGCGTAGGC-MGB | 94

Abbreviations: qMSP, quantitative methylation-specific PCR; MGB, minor groove binder.
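The relationship between the methylated (M) and unmethylated (U) primer sets listed in Appendix III can be checked with a short Python sketch. It relies on the general principle of bisulfite treatment: unmethylated cytosines are read as thymine after amplification, whereas methylated cytosines remain cytosine. The sketch assumes that every C remaining in an M forward primer sits at a CpG site, so the U forward primer carries T at those positions, and that the reverse primers differ correspondingly by G versus A. The helper names are hypothetical, and only the BNIP3 pair from the table is checked here.

```python
def unmethylated_forward(m_forward: str) -> str:
    """Forward primer for an unmethylated template: each C kept in the
    methylated version is read as T."""
    return m_forward.replace("C", "T")


def unmethylated_reverse(m_reverse: str) -> str:
    """Reverse primer (complementary strand) for an unmethylated template:
    each G in the methylated version becomes A."""
    return m_reverse.replace("G", "A")


# BNIP3 primer sequences copied from the Appendix III primer table.
BNIP3_M_FORWARD = "ACGCGTCGTACGTGTTATAC"
BNIP3_U_FORWARD = "GTATGTGTTGTATGTGTTATAT"
BNIP3_M_REVERSE = "ACTACGCTCCCGAACTAAAC"
BNIP3_U_REVERSE = "ACTACACTCCCAAACTAAACAA"

forward_core = unmethylated_forward(BNIP3_M_FORWARD)
reverse_core = unmethylated_reverse(BNIP3_M_REVERSE)

# The listed U primers contain the converted M sequences, extended by two bases.
print(forward_core, BNIP3_U_FORWARD.endswith(forward_core))
print(reverse_core, BNIP3_U_REVERSE.startswith(reverse_core))
```

For this pair, the converted M sequences are contained in the listed U primers, which are two bases longer at one end; whether the same pattern holds exactly for every primer pair in the table is not verified by this sketch.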