BLAT Worked Examples - The MDI Biological Laboratory

NECC First Skate Genome Annotation Workshop
BLAT Worked Examples
Benjamin King
Mount Desert Island Biological Laboratory
Worked Example #1: Use BLAT to align the protein sequence for skate SHH to the human
genome.
STEP 1: Go to Entrez to get the protein sequence for skate SHH (GenPept sequence identifier
EF100667) in FASTA format.
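For readers who would rather script this step, the same FASTA record can in principle be requested from NCBI's E-utilities efetch service instead of the Entrez web pages. The sketch below is not part of the original workshop. The helper names (efetch_fasta_url, parse_fasta, fetch_fasta) are illustrative, and the db value is an assumption: whether an accession resolves in the "protein" or the "nuccore" database depends on the record, so it may need to be changed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="protein"):
    """Build an efetch URL that asks for one record as plain-text FASTA.

    `db` is an assumption; use "nuccore" for nucleotide accessions.
    """
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Split a single-record FASTA string into (header, sequence)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    return lines[0][1:], "".join(lines[1:])


def fetch_fasta(accession, db="protein"):
    """Download a record (needs network access; not called in this sketch)."""
    with urlopen(efetch_fasta_url(accession, db)) as response:
        return response.read().decode("utf-8")
```

Calling fetch_fasta("NM_000193", db="nuccore") would, under these assumptions, return the transcript used in Worked Example #2. The text it returns can be pasted into the BLAT form unchanged.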
STEP 2: Go to the BLAT page at the UCSC Genome Browser site (http://genome.ucsc.edu). On
the UCSC Genome Browser home page, you will see a link, named BLAT, on the left side of the
page. The BLAT web form will appear as shown below. By default, the human genome will be
searched. Notice how you can select another genome and then a particular version of the
assembly for that genome.
STEP 3: Paste your sequence in FASTA format from Entrez into the BLAT form. Then, click the
“submit” button to start the search.
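The form submission can also be written as a URL, which is handy when the same search has to be repeated. The sketch below only builds such a URL and sends nothing. It assumes the hgBlat parameters that UCSC describes for URL-based queries (userSeq, type, db, output) and assumes the assembly name hg38, which this handout does not specify. The function name blat_query_url is illustrative. Check the current UCSC documentation and usage limits before relying on it.

```python
from urllib.parse import urlencode

HGBLAT = "https://genome.ucsc.edu/cgi-bin/hgBlat"


def blat_query_url(sequence, query_type="protein", db="hg38"):
    """Build a URL that would submit `sequence` to the UCSC BLAT server.

    `db` is the assembly to search (assumed here to be hg38) and
    `query_type` mirrors the "Query type" menu on the web form.
    """
    # Drop a FASTA header line, if present, and join the sequence lines.
    lines = [ln.strip() for ln in sequence.strip().splitlines()]
    residues = "".join(ln for ln in lines if ln and not ln.startswith(">"))
    params = {"userSeq": residues, "type": query_type, "db": db, "output": "json"}
    return HGBLAT + "?" + urlencode(params)
```

For example, blat_query_url(fasta_text) with the skate SHH record gives a link that can be opened in a browser. Passing query_type="DNA" would suit the transcript query in Worked Example #2.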
STEP 4: On the next page, you will see a table of the hits from BLAT. Notice that there are hits
to Chr. 7, 2, and 12. Why do you think there are multiple hits? The “details” link will show you the
pairwise alignment of the back-translated protein sequence to the human genome. Click on the
“details” link (open it in a new window) to view the pairwise alignment, then scroll down in the
resulting page.
Here is part of the pairwise alignment.
STEP 5: Next, click the “browser” link for the first hit to view the region of the human genome (in
this case Chr. 7) where the skate SHH protein aligns. As expected, the skate SHH protein aligns
where the human SHH gene is located.
Worked Example #2: Use BLAT to align the human RefSeq transcript sequence for SHH to the
human genome to examine gene structure.
STEP 1: Go to Entrez to get the transcript sequence for human SHH (RefSeq sequence identifier
NM_000193) in FASTA format.
STEP 2: Go to the BLAT query form at the UCSC Genome Browser and paste the FASTA-formatted
sequence into the form, as we did in the previous example (and as shown below).
STEP 3: On the resulting page, click on the “details” link for the first hit (open it in a new
page) and examine the pairwise alignment to make sure the hit appears to be significant.
The pairwise alignment should appear as follows. By scrolling through the alignment, we see that
the entire query sequence aligns to three different exons (blocks), as we would expect. We can
also see that the splice site consensus sequences are correct: each intron begins with GT and
ends with AG.
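The check described above (exon blocks separated by introns with the expected splice signals) can be sketched in code. The example below is a minimal illustration, not the workshop's method. It assumes the block coordinates come as PSL-style comma-separated lists (blockSizes and tStarts, 0-based), that the alignment is on the plus strand, and that the genomic sequence is available as a string. The sequence and coordinates are made up and do not describe the real SHH locus.

```python
def parse_blocks(block_sizes, t_starts):
    """Turn PSL-style comma-separated lists into (start, end) exon intervals.

    Coordinates are 0-based and half-open, as in the PSL format.
    """
    sizes = [int(x) for x in block_sizes.split(",") if x]
    starts = [int(x) for x in t_starts.split(",") if x]
    if len(sizes) != len(starts):
        raise ValueError("blockSizes and tStarts differ in length")
    return [(s, s + n) for s, n in zip(starts, sizes)]


def introns_between(exons):
    """Return the gaps between consecutive exon blocks."""
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def has_canonical_splice_sites(genome, intron):
    """True if the intron starts with GT and ends with AG (plus strand only)."""
    start, end = intron
    seq = genome[start:end].upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Made-up three-exon example; this is NOT the real SHH locus.
toy_genome = "ATGGCC" + "GTAAGTCCAG" + "TTTGGA" + "GTCCCTTTAG" + "CCATGA"
toy_exons = parse_blocks("6,6,6,", "0,16,32,")
toy_introns = introns_between(toy_exons)
splice_ok = [has_canonical_splice_sites(toy_genome, i) for i in toy_introns]
```

For a minus-strand alignment, each intron sequence would need to be reverse-complemented before the GT and AG test. This sketch leaves that step out.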
STEP 4: Finally, view the alignment along human Chr. 7 by clicking on the “browser” link back on
the results table.
Additional Exercise:
1. Map the SNP rs13483726 to the mouse genome using BLAT. You can obtain the sequence
flanking the SNP from Entrez. Then, use BLAT (http://genome.ucsc.edu), with the mouse
genome selected, to map the sequence (and therefore the SNP).
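As a hedged illustration of the exercise, the sketch below turns a flanking sequence into a plain BLAT query and converts the SNP's offset in the query into a genome coordinate. It assumes the flanking sequence is written with the alleles in square brackets (for example LEFT[A/G]RIGHT), that the alleles are single bases, and that the BLAT hit is a single ungapped block on the plus strand. The function names and the example sequence are made up.

```python
import re


def snp_query(flanking, allele_index=0):
    """Convert 'LEFT[A/G]RIGHT' into a plain query and the SNP's offset.

    Returns (sequence, offset), where offset is the 0-based position of the
    chosen allele within the returned sequence.
    """
    match = re.fullmatch(r"([ACGTN]*)\[([ACGTN/]+)\]([ACGTN]*)", flanking.upper())
    if match is None:
        raise ValueError("expected the form LEFT[A/G]RIGHT")
    left, alleles, right = match.groups()
    allele = alleles.split("/")[allele_index]
    return left + allele + right, len(left)


def snp_target_position(t_start, q_start, offset):
    """Map the SNP offset to the genome for an ungapped plus-strand hit.

    `t_start` and `q_start` are the 0-based starts of the hit in the genome
    and in the query, respectively.
    """
    return t_start + (offset - q_start)


# Made-up flanking sequence, for illustration only.
query, offset = snp_query("acgtt[A/G]ccgta")
```

The query string is what would be pasted into the BLAT form. Once the hit's start coordinates are known, snp_target_position gives the SNP's 0-based position on the chromosome under the stated assumptions.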