Biology 4100 Minor Assignment 1

Biology 3200 Minor Assignment 2
This assignment is due in class on April 2, 2009, and is worth the same as two quizzes. It
consists of the four questions embedded in the text below. Your assignment must be typed on
white paper, in a font no smaller than 12 point, with double-spaced lines.
A new biotechnology company called Brent Biotechnica has hired you. The company is
interested in developing a line of commercial enzymes for a number of applications in the livestock,
food and beverage, pulp and paper, pharmaceutical, and textiles industries. The company is bioprospecting in a number of microbial ecosystems, including the rumen, soil and hot springs, for novel
genetic material. You have been assigned to the R&D group responsible for expressing newly cloned
genes. Your first project is to aid in the cloning of a phosphatase gene from Megasphaera elsdenii. A
region of M. elsdenii genomic DNA containing a putative phosphatase gene has been amplified by
PCR. The resulting fragment contains more than just the phosphatase gene.
You have been given a nucleotide sequence file (Appendix 1) for the 5183 bp PCR product
(i.e., a double stranded linear DNA fragment) amplified from M. elsdenii genomic DNA. You have
been asked to work with a research team to prepare a protocol for cloning ONLY the phosphatase
coding sequence (CDS) into pUC19 (refer to Appendix 2 for a schematic representation of pUC19).
Your protocol must include the following information.
1) Basic Local Alignment Search Tool (BLAST) analyses. A BLAST analysis finds regions of similarity
between your query and database entries. In this exercise, you will use the BLAST feature of the NCBI
site (www.ncbi.nlm.nih.gov) to identify the likely coding sequences in the 5183 bp PCR product from
M. elsdenii. There are a number of search options available. For the purposes of this exercise you will
perform a translated query - protein database search (blastx). In the blastx search, one submits
nucleotide sequence data and the program translates the sequence into all six reading frames and
compares the resulting amino acid sequences to the protein database. Copy and paste the nucleotide
sequence from Appendix 1 into the Enter Query Sequence box of the blastx server. Note: Be sure you
use the “nr” database option in your searches.
a) The blastx search will find a number of potential coding sequences (i.e., BLAST hits) on the 5183 bp
PCR product from M. elsdenii and illustrate their approximate locations on the fragment in the Graphic
Summary window. How many different potential coding sequences has the blastx program identified,
and what are their potential identities (i.e., what might these potential coding regions encode)? This
information can be found in the Graphic Summary window, as well as in the Descriptions and Alignments
windows below the Graphic Summary window. Move the pointer over the colored horizontal lines in
the Graphic Summary window. These lines represent the extent of the significant alignments between
the M. elsdenii sequence and different database entries. If you click on these lines you are taken to the
entry in the Alignments window. Notice the other types of information that are found in the
Descriptions and Alignments windows. Prepare a table listing the following information: possible
identity of the potential coding sequences, approximate location and reading frame on the M. elsdenii
fragment, and significance (i.e., E value) of the best match for each potential coding region on the M.
elsdenii fragment. Also, be sure to highlight the phosphatase sequence.
(10 marks)
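The six-frame translation that blastx performs on a nucleotide query can be illustrated with a short Python sketch. This is not how NCBI implements blastx; it only shows the translation step described above. The function names are hypothetical, the standard codon assignments are assumed, and negative frames are assumed to be read from the reverse complement. Codons containing ambiguity codes (such as the N, Y, K and R characters in Appendix 1) are reported as X.

```python
from itertools import product

# Standard codon assignments, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Complements for A, C, G, T and the IUPAC ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; '*' marks a stop, 'X' an ambiguous codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Map frames 1, 2, 3 (forward) and -1, -2, -3 (reverse) to protein strings."""
    frames = {}
    rc = reverse_complement(seq)
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(frame, protein)
```

Each of the six protein strings is what a translated search would compare against the protein database, which is why a hit is reported together with its reading frame.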
Biol 3200 Assignment 2, March 18, 2009
2) Identify the exact location of the phosphatase coding sequence [i.e., open reading frame (ORF)] on
the PCR product. You can use the ORF finder program for this task. The link for this program can be
found under the Hot Spots list on the right-hand side of the NCBI homepage. You can also take
advantage of the information gathered during your blastx search to assist you in this task. Once you are
at the ORF finder data entry page, you can paste the M. elsdenii sequence into the data entry window
(i.e., the box below “or sequence in FASTA format”). Select the OrfFind button. The resulting window
shows all six reading frames (from top to bottom the reading frames are 1, 2, 3, -1, -2, -3) and the
colored boxes represent the ORFs in each reading frame of the M. elsdenii sequence. If you click on an
ORF, its putative amino acid sequence will appear below the reading frames. Your answer should
include a brief description and justification of your approach to identifying the phosphatase ORF,
together with the ORF's exact location and orientation on the PCR product.
(20 marks)
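The idea behind scanning all six reading frames for ORFs can be sketched in Python as follows. This is a simplified illustration, not the ORF finder algorithm: it assumes ORFs start at ATG only (alternative start codons are ignored), it reports coordinates that include the stop codon, and it numbers negative frames by their offset on the reverse complement. The function names and the min_codons parameter are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Return (frame, left, right) tuples for ATG-to-stop ORFs.

    left and right are 1-based positions on the submitted (forward) strand,
    with left < right for both orientations; a negative frame means the ORF
    is read from the reverse complement. min_codons counts coding codons,
    excluding the stop codon.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, template in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, n - 2, 3):
                codon = template[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG after the previous stop
                elif start is not None and codon in STOPS:
                    end = i + 3  # exclusive end, stop codon included
                    if (end - start) // 3 - 1 >= min_codons:
                        if strand == 1:
                            left, right = start + 1, end
                        else:
                            # convert reverse-strand indices to forward coordinates
                            left, right = n - end + 1, n - start
                        orfs.append((strand * (offset + 1), left, right))
                    start = None
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGGCCAAATAGGG", min_codons=3))
```

Comparing the coordinates and frame of each long ORF with the location and frame of the phosphatase hit from the blastx search is one way to justify which ORF is the phosphatase CDS.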
3) Your final task is to generate a restriction map using NEBcutter (http://www.neb.com) to identify
the best restriction enzymes to clone ONLY the phosphatase ORF into pUC19. Prepare a restriction
map of the M. elsdenii sequence using only the restriction sites found in the pUC19 multiple cloning site (MCS). Include a
copy of the restriction map in your report and identify the best restriction enzymes to use for cloning
only the phosphatase ORF from M. elsdenii into pUC19. Your answer should include a brief
justification of your approach.
(20 marks)
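The reasoning behind choosing cloning enzymes (a site on each side of the ORF and none inside it) can be sketched in Python. This is a minimal illustration and not a substitute for NEBcutter: it covers only ten palindromic six-base sites commonly listed for the pUC19 MCS, it reports the first base of each recognition site rather than the cut position, and the function names and ORF coordinates are hypothetical. Because every listed site is palindromic, searching one strand is sufficient.

```python
# Recognition sequences for ten six-cutters in the pUC19 MCS (a partial list).
MCS_SITES = {
    "EcoRI": "GAATTC", "SacI": "GAGCTC", "KpnI": "GGTACC", "SmaI": "CCCGGG",
    "BamHI": "GGATCC", "XbaI": "TCTAGA", "SalI": "GTCGAC", "PstI": "CTGCAG",
    "SphI": "GCATGC", "HindIII": "AAGCTT",
}


def site_positions(seq, site):
    """Return 1-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + 1)
        i = seq.find(site, i + 1)
    return positions


def map_sites(seq):
    """Map each enzyme name to the list of its site positions in seq."""
    return {name: site_positions(seq, site) for name, site in MCS_SITES.items()}


def flanking_sites(site_map, orf_left, orf_right, site_len=6):
    """Keep enzymes that have no site overlapping the ORF.

    Returns {enzyme: (nearest upstream site, nearest downstream site)},
    using None where no site exists on that side of the ORF.
    """
    result = {}
    for enzyme, positions in site_map.items():
        if any(p <= orf_right and p + site_len - 1 >= orf_left for p in positions):
            continue  # this enzyme would cut within the ORF
        upstream = [p for p in positions if p + site_len - 1 < orf_left]
        downstream = [p for p in positions if p > orf_right]
        if upstream or downstream:
            result[enzyme] = (max(upstream, default=None),
                              min(downstream, default=None))
    return result


if __name__ == "__main__":
    demo = "GAATTC" + "ATGAAATAG" + "AAGCTT"  # toy fragment, ORF at 7..15
    print(flanking_sites(map_sites(demo), 7, 15))
```

In the toy fragment, EcoRI has a site upstream of the ORF and HindIII has one downstream, so that pair would excise the ORF intact. The same check applied to the real sequence also has to consider how much flanking DNA each enzyme pair leaves attached, since the task is to clone only the phosphatase ORF.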
NOTE: The sites described above contain help and FAQ sections.
Appendix 1. Nucleotide Sequence Data for the M. elsdenii 5183 bp PCR product
GTCATGGGCA
GAACAGNGCG
GCCCGACGCA
CATATCCAAG
AAGAATACAA
CGGYAGCTCC
CTCKCACATC
GGTAACGTCC
CGCCGATATC
CGACGAAAAC
CCTCTTGCTC
TGTAAAACAG
GCCGCCGGGA
CCTGGTCGTC
GTCGGAAATG
TCCTCACTGC
GAAATACGAT
CGATAAAGGT
GCAAGGCGTG
GTATTAAGTT
CAAACTACAA
TGACAGCTCT
CGACCAGCTT
AGCCACAAGC
TGCCAAAGAA
CGGCGTTGAG
TGGCCGGGAC
CGACAGTCGA
TAACGGTCTG
TTGACGAATA
CCGCTCAGAN
AACTGGCCGA
TTTGTGAACG
AGGAGAATGA
TGCGCTGAAC
AAGCATGTCA
CTTATGGCTG
ACGGGCCCGT
GGTATCTATC
GAAGACGCTG
TTCTGGACTG
ACGGGTGACG
ACGTCGCCGC
ATGCACGTCC
CACGAAAAAT
ATTCCCCATA
GACATCGAAG
TTAGGGAAAT
TGCAGTTCGT
ACTCCATACT
CCTATAAGGA
GCGGGCCGCC
TTCTAATCAT
GACTTCTGTG
CGGTACGTCG
GAGTTCCCAG
TACCTGATGG
GGAAACAGTC
CGAAGTCATC
CAATGTGGCG
TGCCCTCATC
TTCCGGTACG
TCCCAATGAG
GCACCCAGCC
TCGCTGTTGT
TTGAAATGCA
CGATTCCTAA
CGGTTACGAC
CCGCTCCGGT
ACGTCATCTG
TTCCGCTGAC
AGGAACTCGT
CAGTCCTGGG
TAGCCATCTT
CGGCAAAACT
GCTACGAAGC
AATAAAAAGG
AGTTTTTAGT
ACAAACTATA
GAGCCACTGC
CTTTGGGACG
TTTACATTCT
TTATCCATGA
TCAGCAGCAT
GGATCGCCAT
TATTCGGTCT
GGCAGCGTCG
CGCCTCATTG
CGCACGACGG
CATGGCAAAC
TGCTGTCATG
TGAAAATTGT
CCAGAGCCTG
CAGCGGCAAA
GCGCCGCGGN
GGTCTTCGGC
CAAGACGGGC
CGTATGGCGC
GGGCGACATC
AGTCTTCCAG
TTCTATGATT
CCTCATCGAA
CGGTCAGAGC
ACCGATCGAC
CAGCTACATG
GCTTGTCGCA
GAAGTTTTTT
TTGAAAGAAC
GGGCCGCCCA
GCCCGTACGA
TCGTGGCGTC
CGCCCTGGAA
GACATTCGTC
CGTGATGGCG
TATCGCCAGC
GTTTCTGCTG
ATTGGGTCGG
TAACCGGCAT
GTCTCGTCAT
CCTGCTGTCA
ACACACGATT
CARGCTGCCC
GGCGGCGTCG
TTCAAGACGG
CTGACGGACC
ATCAAAGTCA
GGCCCGGTCA
GACTACATGT
TCCCTGCCCA
GTCGAAAAAG
AACATGAGCT
CATGTCGAAG
CCGGAATTTG
GCGGACACAG
CGACGACAAG
AAATTTAAAG
AACCTCTCAG
TCCTCTGATG
GCCACTAGCC
GATGCCGAGG
GTATTCCGAG
ACCGATGGTA
GGCCGGATTT
CTGCTGAGCT
GAAGTGGAAG
CCTGACGCAG
TTATGATTCG
CGAAGGCGGC
CGCCGGAAAA
GTTCCAGCTG
CCCATGCCGG
GTAAATCCCT
CTATCCTCGA
ATGCGACAGG
TGTCCATGAA
TTTCCGGTGC
TCGTCGACAT
TGGACGGCAT
CATTGAACAT
ATTTCGAATG
AATCGGCACA
CATCCCATTG
CCTCCTTCCT
CCCTTTTTGA
CTAGCTTCTA
TCAGCTTCGC
CCCGTTTTGG
ACTAGCCACT
GCTCTCATGA
CCGGGACCTT
TCGGGGATAT
GCTTTTTCGA
TCCTGTTTGG
TTTTCGTAAT
ATTCCGGATG
ACCCGTCGTG
CGGCCAGGCC
CGTGGTCGGC
CATATTCGAT
CATGGAGCTG
TGCTAAGGAT
TACCGAGGAT
GATGGAACAG
TGACGAAGGT
TCGAACCCTG
GGCGATGATA
TATGACCGAC
CGGACATGCC
AGGGATCTTT
TCGCATAAGC
TAATCAGGCA
TCATGCCTTT
CGAAGAGGAT
CATCGGCGGA
TTTTAGGGAA
CCTTTAATTC
TCTTCAAATT
ATTTCGCATT
CTGCCAGTGA
TATCATGACC
TAGAAACGGT
TGATTTATAG
CGGGCCGCTT
GCATCGTCCA
GCATCGGCCT
TGTCCATCCC
TTCCCTGTAT
CCGGTCGGGA
TCATCGAGAA
TGCGCGTCCC
CGGCACGCCA
TCCGTCAAGG
ACGGCCTGGC
TCATGGCCCA
CAAAAATGAT
TGCTCGGCGC
CGGACGATGC
TCCTTCAAGA
ATATCAGCAC
ATGGCCCCCA
AACAAGTCGG
AAGGTGACGT
TCAAAGGGCT
TCGATGTGAG
ATACCGGCCT
TCGAGGCTGT
TTGATACCCA
GGCTGGCCGC
CGCAGGACCT
CGGATCGTCG
CGGTCAAATA
TCGATGGTTG
GTTGGCATAC
GTTGGCATCT
TTTCTTGCCG
GACGACGATG
GGCCTTATCG
TTTATCGATG
GTCCAGGGAT
TTCCGGGTTC
GCCCAGGAGC
ATAGCCTTTA
TTCGGACTGC
ATCTTCGAGA
CATGGCAGCC
GCGGCTGCGT
CGTGGAATCT
AGACGCGGAG
GTTATGGGGC
GGAAATGATG
GACGTTCTTG
GTTAGCGCCG
ATTGACCTGA
CTGGCCAAAG
CTTGGCTTTC
ATCGACGGGA
GCCGCAGGCT
TCCTGAGTTT
CAAAACTATG
CTGCTTTGAG
CCGCAAGAAA
CAAGGTCTGC
CTCGACAGCC
CGGCTATACC
CGTCGAAAAT
TATGGCGTCC
CGGTCATATT
CGCCGGCCGT
AAAAGCCGCC
CGATGCATCG
CAACATCGCC
GGAAGCCATC
GCTGCCGCCG
TTTGTACCTG
GGGTAACTTT
GGCCGTCATA
CGGCCTTGAT
TTTCCATATC
TCATGATCGG
TCGAGACCAT
TCAAATCCTG
GATCCGTCTT
GTCCATAGCC
ACGACCGTTC
GTTTCTTCAA
TCTGTCCGAT
GGCCCCCTTC
TGTCTATTTT
AGATGACATT
TCCCGGCCAG
TGGACGGCCA
TTAAAGGTCG
TCCCGTTCAT
ATCAGCGTGT
AACTCGATGG
GATGCGCCTT
TTCTTGGTCA
TTGAGCATTT
GGTTTGTTGT
TCCTTAAATT
GGGATATAAC
TATTCCGAAG
GGCGTAGCGT
TTTACGATTT
GCATAGACCC
TTGGCGCTGT
ATCTTTTCCA
CGGGCCATTT
GCCTGGCGCT
TCAGCATCGG
TTGGCACGAT
AATTTATCGG
GATACTTCGA
AGGATGCGGA
ACCATAGAAA
CGTAACATAT
TATAATCTTT
AATTCCGGCA
GGCAGCCTGC
TCGCTCATCC
GGCATCGTCG
GGGACGGACT
GATACCAACT
CTCTTTGCCG
CTCCACGGTG
CTCCACGATG
GACCCGGAAA
GCTTTAGCCG
TGCCTGTGCA
CTGCGGCCCC
AAAGGCCTGG
TTACAACAAA
TTCATCGACT
GGTTTCCCGT
AGCCATGACG
GTCGGAAACG
CTTGGAGACG
GATGAACATT
TTTCAAGAGT
GACTTTGCCA
GGCAACGGCT
GGCCTGGGAG
CTCAGTCCCC
GGCCTGTTCC
CTGATCGAGG
TTTGGTCAGA
CCCGGCGTGG
GCC
GGGTGACATC
GGAAGATGGA
GATAGGTGCC
TGTCCCCGTG
TATCATACGT
CAATCATGGC
TGTTGATGAG
CCCGATCGAT
CCGGTGCGGC
CCTGGATGAC
GGCGCGAGCC
CGATGTAGTC
CTGTCGATTC
CCGAAATATT
CCAAGGCATT
CTGTCGTCAG
TAGCCAGCAA
CGCGGGCTTG
GGGCTTTTTC
TGGCCGTCGC
ATGAAATGAC
CTTTCTGTCC
CGTCGAAATC
CATCGGCCGC
GAATTGCTCG
TCGACATGGT
GTATAATCCC
CGGCCATCAA
CGACCGAAGC
ATCGCCAGCT
ATACAGAAAG
GGAAAGGCAT
GCGCCGCCCT
TTACGGTCGG
CCAGCCGGGC
TCGCCTCGGC
AGGTCGACGG
CCCTCGATGC
ATCCCGACGG
GATTGGAACA
CTTTCGCTCA
AAAGTAAGCT
AAGGCCCGGA
TCATGGGCAT
CCCATGTCGG
ATGAGACCCT
CAGGCCGGCA
TTCTGATTGT
TCATCCTTAG
ATGCCCGGGA
TCATGGGCGA
CCGAGGTCGA
GTCGTAAGGG
GAAGGAATCG
GCAATAAAAG
TGCTGGACCT
CAGTATTCCA
AGGATCGGGA
GTCGGCATAG
GGTAATGGAA
TTCTTTGTTC
TGCGCGCTGC
AAAGAAGCCT
GTTCGGCTGA
ATAGACGTTC
CATCATTTCT
GTTCTGGTTG
GCCGCCCATG
CTGGTTGGCA
AGCCTGGGTG
TTCGACACGA
GACGACGGAT
GGAATCGTAA
GTCATTGTAC
CAGGGACATG
CTGGACGACG
CTTTACCGTA
GCCGTTGCCC
ATTGACGGTA
GAATTTGGCA
CTGTACACCG
TTTCAACCAT
AAGTATAAAA
AGGAAGGATG
ATTCGGCCTC
CCAGACGGAA
GACCGACCAT
AGGAAGTATC
CCTGGAAAAG
GGGGGAATGG
CACCAGTATC
GGCCGGAGAA
GACGTACCTT
CCATAAGGTC
CTTCTGCCAG
CATCGTCCTC
AGCCTTAGCG
GCGACAGAAT
CTTAATGATT
AGTCTGCATT
ATTCATGGCA
CAAAGGCATA
GGTAGCCAAA
GATTGGGGTC
CTTCGCGCTT
AGCTGTTGAC
AATGCTTCAA
ACTGGGTCAC
CGACAGGAGC
CCCACTGTTT
GCGGGAAAGC
GCATGACGCC
GGTCGGGCCT
GCNGGCGCTG
TTATCCGGGA
GTGCGGACTG
GCGCCGTGAG
CATTTCTGAG
CAATCCATGG
TTCGGGTTCT
TCCGTATATT
ATCGTCCCCG
TTAGAATTGG
TCGTGGCGCT
ATGACGTCCG
CGGCGGCGCG
ATGATACCGA
GGATCATCGA
TTATGGCCTG
CCGCTGGTCG
TTGCCGTTGG
CCGTCGCCGA
GTATAGCGGG
TAGGAACCGG
AAATCCTTCC
ATATCGACAT
CCTGCCAGGA
GGAATGAAAC
TGTCGTTTCA
TTCCTATACA
TGTATACTCA
ATCGACGAAA
GGCGGGCCGG
CCCCTAGAAG
GACTACGCCT
GAATTTCACC
TGGCTCGGCG
GGCGGAGCCT
ATCGCCTATA
TGCCGGGAAA
TTTCAATTCG
AATCTGGCTG
GGCGGCGGCA
GCGCGATTGC
GATGCCGGCA
CAGTTTGGTG
GTTCTTATAT
GACCAGGACT
GTGTTTGGCC
GGTGTGGCGC
CAAAGCCGGG
GATGAGGTTC
GCTGTCCCCG
GGAGCACCAG
GACGCCGGGG
GAAATTGACA
GGCCTGCTCC
CTTGCGCATG
GCTCTTCTTG
GTCCATATTG
GGCCTGGTCG
Appendix 2. Schematic diagram of pUC19 (reproduced from www.neb.com)