BLAT Worked Examples - The MDI Biological Laboratory
NECC First Skate Genome Annotation Workshop
BLAT Worked Examples
Benjamin King
Mount Desert Island Biological Laboratory

Worked Example #1: Use BLAT to align the protein sequence for skate SHH to the human genome.

STEP 1: Go to Entrez to get the protein sequence for skate SHH (GenPept sequence identifier EF100667) in FASTA format.

STEP 2: Go to the BLAT page at the UCSC Genome Browser site (http://genome.ucsc.edu). On the UCSC Genome Browser home page, you will see a link, named BLAT, on the left side of the page. The BLAT web form will appear as shown below. By default, the human genome will be searched. Notice how you can select another genome and then a particular version of the assembly for that genome.

[Screenshot: BLAT web form]

STEP 3: Paste your sequence in FASTA format from Entrez into the BLAT form. Then, click the “submit” button to start the search.

STEP 4: On the next page, you will see a table of the hits from BLAT. Notice how there are hits to Chr. 7, 2 and 12. Why do you think there are multiple hits? The “details” link will show you the pairwise alignment of the back-translated protein sequence to the human genome. Click on the “details” link (open a new window) to view the pairwise alignment. Scroll down in the resulting page. Here is part of the pairwise alignment.

[Screenshot: part of the pairwise alignment]

STEP 5: Next, click the “browser” link for the first hit to view the region of the human genome (in this case Chr. 7) where the skate SHH protein aligns. As we expect, the skate SHH aligns where the human SHH gene is located.

Worked Example #2: Use BLAT to align the human RefSeq transcript sequence for SHH to the human genome to examine gene structure.

STEP 1: Go to Entrez to get the transcript sequence for human SHH (RefSeq sequence identifier NM_000193) in FASTA format.

STEP 2: Go to the BLAT query form at the UCSC Genome Browser and paste the FASTA formatted sequence into the form as we did in the previous example (and shown below).

[Screenshot: BLAT query form with the pasted sequence]

STEP 3: On the resulting page, examine the pairwise alignment (by opening a new page) for the first hit to make sure it appears to be significant by clicking on the “details” link. The pairwise alignment should appear as follows.

[Screenshot: pairwise alignment for the first hit]

By scrolling through the alignment, we see that the entire query sequence aligns to three different exons (blocks) as we would expect. We can also see that the splice site consensus sequences are correct.

STEP 4: Finally, view the alignment along human Chr. 7 by clicking on the “browser” link back on the results table.

Additional Exercise:

1. Map the following SNP, rs13483726, to the mouse genome using BLAT. You can obtain the flanking SNP sequence from Entrez. Then, use BLAT (http://genome.ucsc.edu) to map the sequence (and therefore the SNP).
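
Optional: The check in Worked Example #2, that the whole query aligns as a small number of blocks (exons), can also be made from a script if a hit is saved in PSL, BLAT's 21-column tab-separated format. The Python sketch below is one possible way to do this and is not part of the original workshop material. The names PslHit, parse_psl_line, query_coverage and intron_lengths are hypothetical helpers, and EXAMPLE_PSL is a made-up line (target "chrT", query "query_mRNA") chosen only to show the layout; it is not real BLAT output for SHH. The sketch assumes a DNA or mRNA query, where the block start positions refer to the forward strand of the target.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PslHit:
    """A few fields from one PSL line (hypothetical helper, not a BLAT API)."""
    q_name: str
    q_size: int
    q_start: int
    q_end: int
    t_name: str
    strand: str
    blocks: List[Tuple[int, int]]  # (start, end) on the target, 0-based, half-open


def _int_list(field: str) -> List[int]:
    # PSL list fields look like "600,500,400," (note the trailing comma).
    return [int(x) for x in field.split(",") if x]


def parse_psl_line(line: str) -> PslHit:
    """Parse one headerless PSL line from a DNA/mRNA query (assumption)."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 21:
        raise ValueError(f"expected 21 PSL columns, found {len(f)}")
    block_count = int(f[17])
    sizes = _int_list(f[18])     # blockSizes
    t_starts = _int_list(f[20])  # tStarts
    if len(sizes) != block_count or len(t_starts) != block_count:
        raise ValueError("blockCount does not match blockSizes/tStarts")
    blocks = [(start, start + size) for start, size in zip(t_starts, sizes)]
    return PslHit(
        q_name=f[9], q_size=int(f[10]), q_start=int(f[11]), q_end=int(f[12]),
        t_name=f[13], strand=f[8], blocks=blocks,
    )


def query_coverage(hit: PslHit) -> float:
    """Fraction of the query spanned by the alignment (qStart to qEnd)."""
    return (hit.q_end - hit.q_start) / hit.q_size


def intron_lengths(hit: PslHit) -> List[int]:
    """Target gap between each pair of consecutive blocks."""
    return [nxt[0] - prev[1] for prev, nxt in zip(hit.blocks, hit.blocks[1:])]


# Made-up example line: a 1500-base query aligned in three blocks.
EXAMPLE_PSL = "\t".join([
    "1500", "0", "0", "0",         # matches, misMatches, repMatches, nCount
    "0", "0", "2", "7000",         # qNumInsert, qBaseInsert, tNumInsert, tBaseInsert
    "+", "query_mRNA", "1500", "0", "1500",  # strand, qName, qSize, qStart, qEnd
    "chrT", "100000", "10000", "18500",      # tName, tSize, tStart, tEnd
    "3", "600,500,400,", "0,600,1100,", "10000,13600,18100,",
])

if __name__ == "__main__":
    hit = parse_psl_line(EXAMPLE_PSL)
    print(hit.t_name, hit.strand, len(hit.blocks), "blocks")
    print("query coverage:", query_coverage(hit))
    print("gaps between blocks:", intron_lengths(hit))
```

A hit whose query coverage is close to 1 and whose block count matches the expected number of exons corresponds to what we confirm by eye in the “details” view. For protein queries the strand field has two characters and block coordinates follow different rules, so this sketch should not be applied to Worked Example #1 without changes.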