The Continually Expanding Internet: how to find Quality Information
NOLUG Presentation, 27th February 2009
Presented by Karen Blakeman
http://www.rba.co.uk/nolug/
Photo: Oslo University College http://www.flickr.com/photos/damiel/1534329928/
27 May 2010 Karen Blakeman www.rba.co.uk

This presentation is licensed under a Creative Commons Attribution 3.0 License

Karen Blakeman
RBA Information Services
Tel: +44 118 947 2256
[email protected]
http://www.rba.co.uk/
blog: http://www.rba.co.uk/wordpress/
Facebook – Karen Blakeman
Twitter: karenblakeman

What Google's homepage may look like in 2084
www.nytimes.com/imagepages/2005/10/10/opinion/1010opart.html

Two points to remember
1. Google et al do not exist to help you find information
2. Search engines, and in particular Google, are temperamental beasts
Do not attempt to apply logic to the way they work – therein lies the path to madness

Types of search tools
Humans
– is a colleague or a friend working, or has one worked, in the subject area?
– who have you met at meetings, conferences?
– discussion lists, trade/professional associations, bloggers, LinkedIn, Facebook etc.
Search engines
– different options for different types of information e.g. news, images
Evaluated listings, subject listings, types of information
Databases and peer reviewed sources
Multi search engine tools
– search many search tools at once
– or type in your search once and click on each search tool in turn

How up to date are search engines?
Not very
You are searching an out of date index of the web and not the live web itself
May take days to months for a site to be added to the index
Hierarchy of sites for updating
Some tools keep links to dead pages for a long time
Least up to date:
– Google
Most up to date:
– Live Search, Yahoo

A search engine’s results may vary
In content and presentation
– from one minute to the next
– different server being used
– testing out different search and ranking algorithms
Country versions
– different emphasis
– local content
– different interface
– different search features

General search techniques
By default, the major search tools look for all of your terms in a page
Use double quote marks around phrases
– e.g. “climate change”
To exclude pages containing a term, precede the term with a minus sign (-)
– use with care
Boolean search – OR, AND, NOT
– must use capital letters for the operators
– only OR works in Google, and even that does not work well, but it is worth trying
– for more complex searches Live.com, MSE360 and Exalead are best (Yahoo has withdrawn NOT and nested searches no longer work correctly)
– for example (directory OR directories OR database) AND (oil OR petroleum) AND Norway

General search techniques (2)
Focus your search on areas of the document
– inurl: for example inurl:“climate change”
  • looks for your terms in the URL
– intitle: for example intitle:“climate change”
  • looks for your term in the title of the page
Search sites or domains using the site: command
– chocolate labelling regulations site:europa.eu
Imagine what you would like to appear in your ideal document and include those terms in your strategy
Partially answer your question in your strategy
– “A hippopotamus can run at”
Use the file formats and domain search to refine your search
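
The techniques on the last two slides (phrases in double quotes, the minus sign, capitalised Boolean operators, intitle: and site:) can be combined into one query string. The sketch below is only an illustration of how such a string is put together. The function names (quote_phrase, any_of, all_of, build_query) are hypothetical and are not part of any search engine's API, and whether a given engine honours each operator varies, as the slides note.

```python
# A minimal sketch: assemble a search query string from the operators
# described above. All function names here are hypothetical helpers.

def quote_phrase(text):
    """Wrap a phrase in double quote marks so it is searched as a phrase."""
    return f'"{text}"'


def any_of(*alternatives):
    """Group alternatives with OR (the operator must be in capital letters)."""
    return "(" + " OR ".join(alternatives) + ")"


def all_of(*parts):
    """Join groups or terms with AND (capital letters again)."""
    return " AND ".join(parts)


def build_query(terms, phrases=(), exclude=(), site=None, intitle=None):
    """Combine plain terms, quoted phrases, excluded terms and commands."""
    parts = list(terms)
    parts.extend(quote_phrase(p) for p in phrases)
    parts.extend("-" + t for t in exclude)  # minus sign: use with care
    if intitle:
        # Quote the value only when it is a multi-word phrase.
        value = quote_phrase(intitle) if " " in intitle else intitle
        parts.append("intitle:" + value)
    if site:
        parts.append("site:" + site)
    return " ".join(parts)


if __name__ == "__main__":
    # The site: example from the slide.
    print(build_query(["chocolate", "labelling", "regulations"], site="europa.eu"))
    # The nested Boolean example from the slide.
    print(all_of(any_of("directory", "directories", "database"),
                 any_of("oil", "petroleum"),
                 "Norway"))
```

The resulting strings can be pasted into a search box. Nested Boolean strings like the second one are only worth trying in the tools the slides name as supporting them.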
File format search
Use advanced search options to limit your search to file types or format:
– pdf or doc for government or industry/market reports
– xls for data and statistics
– ppt or pdf for presentations
– search in at least Google and Yahoo, also consider Live.com
Looking for experts on a topic, presentations, a “how to” guide, general background on a subject, information on an organisation
– advanced search ppt or pdf format
– Slideshare http://www.slideshare.net/
– authorSTREAM http://www.authorstream.com/
– YouTube http://www.youtube.com/

Advanced Search options can vary depending on the country version of Google

General search techniques (3)
Repeat your key search terms in your strategy
– chocolate production france belgium austria
– chocolate production austria france belgium belgium belgium
  • give different results
  • in Google can enter up to 32 terms, Yahoo 250 characters
Change the order of your terms
– chocolate production france belgium austria
– production france belgium austria chocolate
  • different results
  • see the summary and comparison chart for the major search engines at http://www.rba.co.uk/search/compare.pdf and http://www.rba.co.uk/search/compare.shtml

Unique Google search features
Automatically looks for variations on your terms
– to force an exact match precede your terms with plus signs e.g. air +pollution
Synonym search
– precede your search terms with a tilde (~) e.g.
~banking
– only works on English terms
Numeric range search
– can be weights, distances, years, prices
– use Advanced Search screen or the search box on the Google home page
– search term(s) first value..second value unit of measurement
– toblerone 1..5 kg
– TV advertising spend forecasts 2009..2015

Unique Google search features (2)
Proximity
– use the asterisk (*) to stand in for one or more terms
– macular * degeneration picks up
  • macular retinal degeneration
  • macula disciform degeneration
  • macular choroidal degeneration
  • macular vitelliform degeneration
  • macular pigmentary degeneration
– separates the terms by one or more words
  • no information on maximum number of terms of separation

Google - What’s New
Knol – “A unit of knowledge”
– competing with Wikipedia
– http://knol.google.com/
Google results may now include images, books, news, site summaries and links
– varies depending on country version of Google
Much improved Google Finance, worthy competitor to Yahoo Finance
– http://www.google.com/finance
– BUT country coverage of share prices not as good as Yahoo e.g.
for Norway

Google Finance

Yahoo Finance

Google SearchWiki
Enables you to customise your results
– move pages up or down the ranking, delete pages from your list
– add comments to a page
Must be signed in with a Google account
Can interfere with Firefox add-ons such as Customize Google
Not available in all country versions

Google plug-ins and add-ons
Google Toolbar for both Firefox and IE
– search from your browser
– direct search for highlighted terms
– fully customisable
Firefox Add-on
– Customize Google
– http://www.customizegoogle.com/
– add numbers to results
– can “stream”: keep scrolling down the page to see more results instead of clicking on the next page
– links to other search engines at the top of the results list, engines vary depending on search type e.g. web, news, images

Design your own search engine
For
– regularly searched sites
– selected sites on a topic
– searching sites on a reading list
Rollyo
– http://www.rollyo.com/
– max 25 sites
Google Custom Search Engines
– http://www.google.com/coop/cse
– at least hundreds of sites, maybe thousands!
– can import lists of sites
Cannot search password protected sources or sites where you have to fill in a form to access the information

Google CSE
Examples:
– Netting the Evidence
  • http://www.google.com/coop/cse?cx=004326897958477606950%3Adjcbsrxkatm
– AlacraSearch
  • http://www.alacra.com/alacrasearch
– pipl
  • http://www.pipl.com/
– Chipwrapper
  • http://www.chipwrapper.co.uk/
Can be hosted on your own site or on Google
– http://www.rba.co.uk/sources/energy.shtml
– http://www.google.com/coop/cse?cx=014304212364962740038:tui4ebh5r_a

Create your own Google CSE on Google

..or host it on your own web site or blog

Other search engines...
Different coverage
– level of indexing on web sites
– sites included in the index
– update frequency
– amount of a page that is indexed
Different search features
Different algorithms for sorting results
Compare search engines
– http://ranking.thumbshots.com/

http://ranking.thumbshots.com/

Ask
http://www.ask.com/, http://www.ask.co.uk/
Recent changes resulted in loss of features
Suggests related topics
Particularly good for searching blogs (but need to do a web search first to see the More option)
New Q&A tab/more answers

Exalead
http://www.exalead.com/search/
Supports wild cards
– asterisk (*) at the end of a word
  • pollut* finds pollute, pollutant, polluting etc.
NEAR - finds words within 16 terms of one another
– NEAR/n finds words within n terms of one another
  • climate NEAR/3 change
Approximate spelling, phonetic search (?)
Regular expression (internal masking of letters)
Feedback from users is that there is more European content, which seems to be given priority

http://www.exalead.com/

iSEEK
http://www.iseek.com/
Clusters results into topics, people, places, organisations, date & time
Search on a person gives priority to social media profiles
“Education” option – more research oriented pages

Live Search
http://www.live.com/
Results tend to be more consumer oriented
Has the most up to date database
Possibly has the most extensive database of web pages
Good image search option
Blogs & RSS search http://search.live.com/feeds/
Revamped interface but no improvement in advanced search screen
– best results by using commands e.g. filetype: and Boolean search
Link commands, Books and Academic Live all gone

MSE360.com
http://www.mse360.com/
See reviews at
– http://www.rba.co.uk/wordpress/2008/10/05/mse360-search/
– http://www.rba.co.uk/wordpress/2008/10/06/update-on-mse360/
Full Boolean nested search options
Advanced search screen offers country, phrase, excluding terms, domain/site search
Can use commands e.g. filetype:, site:
Results show web, video, images, Wikipedia and blogs
Quick to respond to bug reports and fix problems

Yahoo!
http://search.yahoo.no/ http://search.yahoo.com/
Results are ranked in a different order to Google
Boolean AND, OR
– NOT no longer available – use the minus sign
– parentheses no longer work
Indexes first 500 K of a document (Google 101 K)
Region command (inherited from Inktomi) region:
– e.g. region:europe, region:mediterranean
– others are africa, asia, centralamerica, northamerica, southamerica, mideast, southeastasia, downunder

Yahoo!

Compare search engines
Graball.com
– http://www.graball.com/
– compares two search engines of your choice side by side
TripleMe
– http://www.tripleme.com/
– compares Google, Yahoo and Live side by side
FuzzFind
– http://www.fuzzfind.com/
– searches Google, Yahoo, Live, Del.icio.us
Zuula
– http://www.zuula.com
– runs your search through a range of search tools one by one
– order can be customised
Browsys Powersearch (was Intelways/Crossengine)
– http://www.browsys.com/powersearch/
– runs your search through a plethora of search tools one by one

FuzzFind
http://www.fuzzfind.com/

Zuula
http://www.zuula.com/

http://www.browsys.com/powersearch/

Evaluated listings and customised search
Evaluated subject listings
Some examples:
– Alacrawiki Industry Spotlights – http://www.alacrawiki.com/
– Intute – http://www.intute.ac.uk/
– Pinakes – http://www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html
Heavy human involvement
– evaluation and assessment of content
– only the home page or relevant section of a site is listed
Customised search engines
– AlacraSearch - http://www.alacra.com/alacrasearch/
– Chipwrapper – http://www.chipwrapper.co.uk/
– Pipl - http://www.pipl.com/

http://www.alacrawiki.com/ - spotlights

http://www.alacra.com/alacrasearch/

Specialist search tools
Think type of information
– news, official company information, statistics, scientific, biomedical?
Reference sources and peer reviewed, for example:
– Wikipedia.org (yes, I know there can be quality issues!)
– Scirus.com
– TechXtra.ac.uk
– Google Scholar (possible quality issues)
– Google Books – especially for older material
Structured databases e.g. Web of Science, Scopus, STN, Factiva, LexisNexis
– often priced

Scientific/Technical & Peer Reviewed Resources
RefSeek – http://www.refseek.com/
Ten Science Search Engines http://hwlibrary.wordpress.com/2008/09/22/science-searchengines/
– Scirus – http://www.scirus.com/
– Scitopia.org – http://www.scitopia.org/
– Science.gov – http://www.science.gov/
– ScienceResearch.com - http://www.scienceresearch.com/
– Scitation - http://scitation.aip.org/
– WorldWideScience.org - http://worldwidescience.org/
– Science Accelerator - http://www.scienceaccelerator.gov/
– TechXtra – http://www.techxtra.ac.uk
– search.optics.org - http://search.optics.org/

Scientific/Technical & Peer Reviewed Resources
Highwire Press http://highwire.stanford.edu/
PubMed Central Homepage http://www.pubmedcentral.nih.gov/
UK PubMed Central http://ukpmc.ac.uk/
DeepDyve http://mysearch.deepdyve.com/start.php
Google Scholar
– http://scholar.google.com/
– use with caution

Google Scholar
http://scholar.google.com/
No source list
Both peer-reviewed and un-reviewed articles, pre-prints, institutional repositories, references to books, citations
Excludes Reed Elsevier
Author search unreliable, search on year of publication unreliable
But
– And the winner is: Google Scholar!
http://74120.weblog.leidenuniv.nl/2009/02/24/and-the-winneris-google-scholar
– Google Scholar Search Performance: Comparative Recall and Precision
– http://tinyurl.com/c7ta6s

Google Scholar
“Google Scholar is brain damaged”
Peter Jacso, Trends in Professional and Academic Online Information Services, presented at Inforum, 22nd May 2007, Prague
Does not use publishers’ meta data
Cannot differentiate between author, affiliation, geographic location, titles and headings
– author:bagsvaerd 115
– author:acknowledgements 158
– author:glossary 471
Cannot differentiate between publication year and page numbers

Google Scholar
2540 documents published in 2011 or 2012!

Scirus
http://www.scirus.com/
Scientific, scholarly, technical and medical information
Reed Elsevier journals
Also web sites, patents and pre-prints
Good advanced search features – date searching, author searching etc.

Scirus

TechXtra
http://www.techxtra.ac.uk
ICBL and the Library at Heriot-Watt University, Edinburgh
Articles, key web sites, theses and dissertations, books, industry news, new job announcements, technical reports, eprints
Engineering, mathematics and computing
Free information and pay per view

Books
Amazon
Google Books http://books.google.com/
– can sometimes search inside the book and look at individual pages
– useful for older texts and suppliers of the book
– Advanced search - search by year, author, title, ISBN
Open Library http://openlibrary.org/
– 23,044,231 books, 1,064,822 with full-text
Project Gutenberg http://www.gutenberg.org/
– different editions may be available e.g.
Darwin’s Origin of Species
viaLibri http://www.vialibri.net/
– rare books from over 20,000 booksellers
Book swap schemes
– Turning over an old leaf
– http://www.guardian.co.uk/environment/2008/may/01/ethicalliving.recycling
– e.g. http://www.bookmooch.com/

News
BBC – http://news.bbc.co.uk/
Search engine news options e.g. Google
– last 30 days of free news
– no source list, key industry publications may not be included
– use country versions for prioritised local content
Google News Archive http://www.google.com/archivesearch
– some sources going back 200 years
– many articles are priced (before you buy check other sources)
Silobreaker - http://www.silobreaker.com/
Individual newspaper sites
– http://www.abyznewslinks.com/

Silobreaker
http://www.silobreaker.com
Covers free resources
News, blogs, video, images
Market trends
Geographical location of stories
People networks

Images
TASI – http://www.tasi.ac.uk/advice/using/finding.html
images.google.com
search.yahoo.com – images tab
Ask – images tab
Live.com - images
Flickr.com
– check the license
– http://www.flickr.com/creativecommons
Wikimedia Commons – http://commons.wikimedia.org/
Freefoto – http://www.freefoto.com/
Morguefile – http://www.morguefile.com
US government web sites
NASA – http://www.nasa.gov/

Audio & Video
Google Video
YouTube
Yahoo
Exalead
Live.com
Blinkx for news – http://www.blinkx.com/
Browsys Powersearch (formerly Intelways/Crossengine)
– http://www.browsys.com/powersearch/
– click on the video tab

Audio & Video

Blogs as sources of information
Blogs by industry gurus and experts are a good way of keeping up to date with what is happening in a sector
Look for the Blogroll or List of Links on a relevant blog
Google Blogsearch http://www.google.com/blogsearch
– use advanced search to search within an individual blog
Ask http://www.ask.com/
– Blogs and feeds
Blog search engines and directories
– http://www.technorati.com/
– http://www.blogpulse.com/

Blogpulse search and trends
Click on the graph to see ‘trends’

Blogpulse Trends
Shows how often your search terms occur in postings
– can compare up to three searches

Twitter
http://www.twitter.com/
Microblogging – postings are called ‘tweets’ and are up to 140 characters long
See who is ‘following’ whom
Monitor conferences, what people are saying about companies, products, services
http://search.twitter.com/

Twitter
Reputation management
What are people saying about you?
– Oh dear!

pipl
http://www.pipl.com/
Review at http://www.rba.co.uk/wordpress/2007/05/05/pipl-peoplesearch-beta/
Searches ‘hidden’ web + Google search
– blog search, Google Groups, LinkedIn, Flickr, Google Scholar, Electoral Roll, Directories, Amazon, Hoovers, Zoominfo etc.
– Google web search results not the same as an ordinary Google search – they incorporate terms such as resume, CV

LinkedIn

Facebook

123People
http://www.123people.com/
Searches
– image sections of major search engines
– Flickr
– Facebook
– LinkedIn
– Blogs
– Web
– Videos
– Email addresses

Search visualisation tools
Different ways of visualising results
Show links between documents, search terms, people, organisations
Can help identify alternative search terms, search topics

kartoo.com

Cluuz
http://www.cluuz.com/
“Cluuz … core technology understands the relationship between the entities, terms, or persons searched leading to more relevant, easy to understand search results”
Not totally intuitive but the network visualisation is ‘cool’
The links in the network visualisation do not always relate to the same person or organisation but they are usually working in a similar field or subject area
Results change from one day to the next, one hour to the next, but still worth a look

Cluuz

Quintura.com

AllPlus.com

‘Disappearing’ pages
Search engine cache copies
– Google, Yahoo, Live, Ask, Exalead
Firefox users
– install the Resurrect Pages add-on
Wayback machine
– http://www.archive.org/
– from 1996 to about 6 months ago
– navigate the archived site or type in the full URL of the document if known

Wayback Machine
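
If the full URL of a vanished page is known, a Wayback Machine address for it can be put together by hand. The sketch below assumes the address pattern web.archive.org/web/<date>/<original URL>, with an asterisk in place of the date to list all captures. That pattern and the function name wayback_url are not taken from the slides, so treat them as assumptions and check against the archive itself.

```python
# A minimal sketch, assuming the Wayback Machine address pattern
# https://web.archive.org/web/<YYYYMMDD or *>/<original URL>.
# The helper name wayback_url is hypothetical.
from datetime import date
from typing import Optional

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, on: Optional[date] = None) -> str:
    """Return an archive address for page_url.

    With a date, ask for the capture nearest that day; without one,
    use "*" to ask for the list of all captures.
    """
    stamp = on.strftime("%Y%m%d") if on else "*"
    return f"{WAYBACK_PREFIX}/{stamp}/{page_url}"


if __name__ == "__main__":
    # All captures of a site mentioned in the slides.
    print(wayback_url("http://www.rba.co.uk/"))
    # The capture nearest the date of the presentation.
    print(wayback_url("http://www.rba.co.uk/", date(2009, 2, 27)))
```

The function only builds the address and does not check that a capture exists. Recent pages may not be in the archive yet.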