The Internet - Husar

The Internet
Introduction
Search engines
Biosci Newsgroups
Link collections
Important entry sites
Comparison HUSAR / Internet
Introduction
It is impossible to present all the resources on the entire WWW
• Interesting sites are added every week
• As the internet changes and grows, many sites that are interesting today may be boring tomorrow
Our object for this session
• Not primarily where to get information but:
• How to get to information (search strategies)
The ‘best’ method doesn’t exist; therefore we present a personal view!
All the indicated links can be found at: genome.dkfz-heidelberg.de
Several basic sources of information
• Search engines
• Metasearch engines
• Medline
• Electronic journals
• Newsgroups (BIOSCI)
• Homepages / portals
• White lists
The internet is a rich source of information
... But you have to combine the right question with the right source!
Example questions
• I need info on Dr. Complexname.
• I need info on baldness.
• I need info on Dr. Smith, working at Baldness University.
• How do I purify DNA from hair?
• Is there a database of hair growth related proteins?
• I need info on Dr. Smith, member of National Baldness Society.
• I need info on baldness related diseases/syndromes.
Sources
• Search engines
• Metasearch engines
• Medline
• Electronic journals
• White lists
• Newsgroups (BIOSCI)
• Homepages / portals
Search engines: shapes and sizes
Do not rely on just one search engine
• AltaVista and Northern Light are two of the largest search engines on the web.
• FAST Search aims to index the entire web.
• Excite has a medium-sized index but uses concept searching.
• Companies can pay GoTo to be placed higher in the search results.
• Google is a search engine that uses link popularity to rank web sites.
• Yahoo is the largest human-compiled directory of the web and employs 150 editors.
• Specialized search engines: Biofinder, www.biologie.de, BioHunt, Pasteur NetBook
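The idea of link popularity mentioned for Google can be illustrated with a minimal sketch. It is not Google's actual ranking method; it only counts how many other pages link to each page and sorts by that count. The function name, the page names and the `links` table are made-up examples.

```python
from collections import Counter

def rank_by_link_popularity(links):
    """Rank pages by the number of distinct other pages that link to them.

    `links` maps each page to the pages it links to (hypothetical data).
    """
    inbound = Counter()
    for source, targets in links.items():
        inbound[source] += 0  # list pages even if nobody links to them
        for target in set(targets):
            if target != source:  # ignore self-links
                inbound[target] += 1
    # Most-linked pages first; ties are broken alphabetically
    return sorted(inbound, key=lambda page: (-inbound[page], page))

# Hypothetical mini-web: three pages link to "embl", one links to "tigr"
links = {
    "lab-a": ["embl", "tigr"],
    "lab-b": ["embl"],
    "lab-c": ["embl", "lab-a"],
    "embl": [],
}
print(rank_by_link_popularity(links))
```

A page that many others point to ends up at the top of the list, regardless of how often the search term occurs on it.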
Multiple search engines query several other search engines in parallel
• Examples: Metacrawler, DogPile, MetaFind, Cyber 411, Savvysearch
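How a metasearch engine queries several engines in parallel can be sketched as follows. This is only an illustration under stated assumptions: the two engines are stand-in functions with canned answers (no network access), and the names `metasearch`, `engine_a` and `engine_b` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def metasearch(query, engines):
    """Send `query` to every engine in parallel and merge the results.

    `engines` maps an engine name to a function that takes a query and
    returns a list of URLs. The result maps each URL to the engines
    that reported it, so duplicates appear only once.
    """
    names = list(engines)
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        # map() keeps the order of `names`, so the merge is deterministic
        result_lists = list(pool.map(lambda name: engines[name](query), names))

    merged = {}
    for name, urls in zip(names, result_lists):
        for url in urls:
            merged.setdefault(url, []).append(name)
    return merged

# Stand-in engines with canned answers (hypothetical)
def engine_a(query):
    return ["example.org/tmhmm", "example.org/krogh"]

def engine_b(query):
    return ["example.org/krogh", "example.org/husar"]

hits = metasearch("TMHMM", {"A": engine_a, "B": engine_b})
for url, found_by in hits.items():
    print(url, found_by)
```

Results found by more than one engine are listed once, together with the engines that returned them.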
[Table: total hits / relevant hits for the test queries TMHMM, Krogh, HUSAR and Venter & Krogh on Yahoo, FAST, AltaVista, Excite and Metacrawler. On the original slide total hits are shown in blue and relevant hits in red.]
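The relation between total and relevant hits can be made concrete with a small calculation. The sketch below assumes that a table cell written as `total/relevant` (for example `23870/1`) gives the total hits first and the relevant hits second; the function name is hypothetical, and cells without exact numbers are skipped.

```python
def relevant_fraction(cell):
    """Return relevant/total for a 'total/relevant' cell, or None if unclear.

    Cells such as '19186/?' or '>150/3' carry no exact numbers and
    are reported as None.
    """
    total_text, _, relevant_text = cell.partition("/")
    if not (total_text.isdigit() and relevant_text.isdigit()):
        return None
    total = int(total_text)
    if total == 0:
        return None
    return int(relevant_text) / total

for cell in ["23870/1", "90/2", "5/0", "19186/?"]:
    print(cell, relevant_fraction(cell))
```

A very large number of total hits with only one relevant hit gives a fraction close to zero, which is why the raw hit count alone says little about the usefulness of an engine.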
Search engines may employ different AND / OR rules
cuiwww.unige.ch/meta-index.html
www.monash.com/spidap4.html
www.library.carleton.edu/staff/terry/websearch/
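The remark that search engines may employ different AND / OR rules can be illustrated with a toy index. This is a minimal sketch, not the behaviour of any particular engine; the `search` function and the three example pages are assumptions made for illustration.

```python
def search(documents, query, rule="AND"):
    """Return the names of documents matching `query` under an AND or OR rule."""
    if rule not in ("AND", "OR"):
        raise ValueError("rule must be 'AND' or 'OR'")
    terms = query.lower().split()
    combine = all if rule == "AND" else any
    return [
        name
        for name, text in documents.items()
        if combine(term in text.lower().split() for term in terms)
    ]

# Hypothetical three-page index
documents = {
    "page1": "Krogh describes TMHMM",
    "page2": "Venter and the genome",
    "page3": "Venter and Krogh",
}
print(search(documents, "Venter Krogh", rule="AND"))  # only page3
print(search(documents, "Venter Krogh", rule="OR"))   # all three pages
```

The same two-word query returns one page under the AND rule and three pages under the OR rule, so hit counts from different engines are only comparable if the rule in use is known.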
Comparison of search engines
http://searchenginewatch.com
Engines compared: Northern Light, AltaVista, Excite, Inktomi, Google, InfoSeek, Lycos, Yahoo, Microsoft, Netscape
Usenet Biosci Newsgroups
Object
• to organize discussions on a large variety of topics
Advantages
• simple to complex questions
• resources are scientists
• Netscape and newslist format
Disadvantages
• traffic can be too high or too low
• resources are scientists
• spam!
Access through a newsreader (e.g. Pine) is also very convenient. Reading via mailing lists or Deja Vu is also possible.
Instructions on how to install Usenet newsgroups are provided by www.bio.net
Useful link collections
• Many, many links for molecular biology. Internet problem: last update 1996
• Many links to a wide variety of databases.
• Many, many links. Highly scientific
EMBL / EBI
Keywords: molecular biology
• Home of databases EMBL, TREMBL and Swissprot
• Mitochondrial database
• Ligand/Receptor database
• Home of European Drosophila Genome Project and Flybase
• Original home of SRS
• Macromolecule structure
• Large array of downloadable software
• Proteomics
• Regular newsletter about the EBI and bioinformatics
• http://industry.ebi.ac.uk/
• Datamining, ESTs, gene prediction, Java, microarrays, sequence analysis, visualisation and Web technology.
Database of databases
EBI's Biocatalog
http://www.sanger.ac.uk
Keywords: large scale sequencing and analysis
This includes major databases and analysis software
• Home of Pfam (Protein Families Database of alignments and HMMs)
• Home of AceDB (management of genome project data)
• and EMBOSS (The European Molecular Biology Open Software Suite)
• An abundance of tools, e.g. Victor Solovyev's gene prediction software
NCBI
Keyword: major US site for sequence analysis
• ENTREZ search and retrieval system
• PubMed
• Home of BLAST
• Home of GenBank
• UniGene database
• COGs (Clusters of Orthologous Groups)
• dbSNP: Single Nucleotide Polymorphisms database
• 600+ genome maps
• Tools such as ORF Finder and e-PCR
NIH
Keywords: research, funding, USA science politics
• 25 separate institutes
• Huge amount of data
• Local search engine
• Still difficult to get to relevant data
The Institute for Genomic Research
Keyword: genome projects
http://www.tigr.org/tdb/
Genome databases, e.g.:
• >20 microbes
• Parasites: Trypanosoma brucei and Plasmodium falciparum
• Human
• Arabidopsis
Abundant software, including:
• A system for finding genes in microbial DNA
• MUMmer for aligning whole genome sequences
• Sequence clean-up program
Institut Pasteur: Bio Netbook
• The Bio Netbook is a search engine especially designed for biologists
• Its index contains only biological expressions (2945)
• Growing database
• The homepage www.pasteur.fr contains a large amount of additional information
GenomeNet
Keywords: metabolic pathways / proteomics / metabolomics
KEGG
• Metabolic pathways
• Regulatory pathways
• Disease Catalogs, Cell Catalogs
• Molecule Catalogs; compounds and enzymes
• Gene Catalogs
• Genome Maps
• Gene Expression Profiles
• Computational Tools
• Links to other pathway and compound sites
GenomeNet / KEGG
Metabolic Pathways
• Graphical pathway maps and ortholog group tables
• Maps are fully interactive
• Clickable signals allow identification of enzymes
Regulatory Pathways
Gene Expression Profiles
• Still preliminary character
• http://bodymap.ims.u-tokyo.ac.jp
More about proteomics
ExPASy
Keywords: Proteins / proteomics / applications
• Many applications
• Largest collection of biology links on the WWW (a few outdated)
• High quality search engine for biologists
• SWISS-MODEL, an automated comparative protein modelling server
• Swiss-PdbViewer is an application that provides a user-friendly interface for analysing several proteins at the same time.
• Software for 2D analysis
MIPS
• Keyword: Proteins and more...
• Databases of proteins (Protfam), RNAs, mitochondrial sequences
• Genome projects of human, yeast and Arabidopsis
• Pathways, Proteomics
• Yeast ORFs and genes
• Small but comprehensive link list
• An alert utility sends you, once per week via email, new database entries related to your field of study.
• ORPHEUS is a software system for gene prediction in complete bacterial genomes and large genomic fragments.
PubCrawler
• It goes to the library. You go to the pub.
• Automatic system which searches PubMed or other databases as often as you want with your keywords or sequences
• Similar systems exist as well; links are indicated on the PubCrawler homepage
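The principle of such an automatic alert system (repeat a stored search and report only what is new) can be sketched in a few lines. This is not PubCrawler's actual implementation; `run_alert`, `new_hits` and the canned `fake_search` function are hypothetical, and no real database is contacted.

```python
def new_hits(current_ids, seen_ids):
    """Return IDs in `current_ids` that were not reported before, in order."""
    return [record_id for record_id in current_ids if record_id not in seen_ids]

def run_alert(search, query, seen_ids):
    """Run one scheduled search and report only the new results.

    `search` is any function that takes a query and returns a list of
    record IDs; `seen_ids` is a set that is updated in place.
    """
    fresh = new_hits(search(query), seen_ids)
    seen_ids.update(fresh)
    return fresh

# Stand-in for a database search: canned results for two successive runs
runs = iter([["id1", "id2"], ["id1", "id2", "id3"]])

def fake_search(query):
    return next(runs)

seen = set()
print(run_alert(fake_search, "hair growth proteins", seen))  # first run
print(run_alert(fake_search, "hair growth proteins", seen))  # second run
```

The first run reports everything it finds; later runs report only entries that have appeared since the previous run.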
IMB Jena
Keywords: biotech and molecular biology
• Many useful links, up to date
• Tools, databases, services
HUSAR
Keyword: Sequence analysis
• Sequence Retrieval System
• GDB
• OMIM
• AceDB
• Genecards is mirrored at the DKFZ
• FAQ, Bioinformatics information
• Link list (>200 links)
• Several free tools (e.g. Genscan)
• ... and the HUSAR package
Sequence analysis: Internet vs. HUSAR
                                      The Internet        HUSAR
Number of applications                many                >250
Up-to-date                            few                 all
Comprehensive                         no                  yes
Databases                             many                >90
Speed                                 low                 high
Security                              low                 high
Data storage                          none                40MB
Handling of multiple or large files   bad (copy/paste)    good
Batch job utilities                   mostly absent       yes
User support                          low                 high
Development of customized tools       no                  yes
Training                              low                 high
Bug removal                           slow                fast
Costs                                 no                  low
Conclusion
The internet contains abundant information; the important thing is to use clever strategies to find it.
You can always contact us at genome@dkfz-heidelberg.de if you have difficulty locating the information that you need.