Using Search Engines
This presentation was prepared for an ad-hoc 'Using the
Internet' session in Auchencairn Village Hall on 23rd November
2000
Searching the Web
Crawlers
- Largely automatic
- 'Robot' or 'Spider' automatically traverses the Web,
following links to find new sites
- Indexes the site automatically using full text
indexing
- Will eventually find any site which is linked to from any
site which is already known.
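The traversal and indexing described above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine is implemented: the miniature 'Web' below (the WEB table and its page names) is invented, and a real robot would fetch pages over the network rather than look them up in a table. It shows the two ideas from the list: following links outward from sites that are already known, and building a full-text index as it goes.

```python
from collections import deque

# Hypothetical miniature "Web": page name -> (page text, outgoing links).
# These pages are invented purely for illustration.
WEB = {
    "a.example": ("village hall events", ["b.example", "c.example"]),
    "b.example": ("hall booking form", ["a.example"]),
    "c.example": ("local events diary", ["d.example"]),
    "d.example": ("tide tables", []),
    "e.example": ("nobody links here", ["a.example"]),
}


def crawl(start):
    """Follow links breadth-first from the known pages, indexing every word."""
    index = {}            # word -> set of pages whose text contains that word
    seen = set(start)     # pages already discovered
    queue = deque(start)  # pages discovered but not yet visited
    while queue:
        page = queue.popleft()
        text, links = WEB[page]
        # Full-text indexing: every word on the page is recorded.
        for word in text.split():
            index.setdefault(word, set()).add(page)
        # Any page linked from a known page becomes known in turn.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen, index


visited, index = crawl(["a.example"])
print(sorted(visited))
print(sorted(index["hall"]))
```

Note that 'e.example' is never found: it links to a known site, but no known site links to it, which is exactly the limitation implied by the last point in the list.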
- Originally
- An academic research project to explore the Internet
- Now
- A stand-alone business funded largely by advertising
- What's good about it
- The oldest...
- Nothing special these days
- Originally
- Technology demonstrator for Digital's superfast Alpha
machines (written in Scotland)
- Now
- A stand-alone business funded largely by advertising
- What's good about it
-
Alta Vista's search language
- "exact phrase"
- Finds pages which contain exactly this phrase
- +important word
- Finds pages which include 'important', preferring those
which also include 'word'
- important -word
- Finds pages which include 'important' but not 'word'
- this near that
- Finds pages which include 'this' near 'that'
- this and that
- Finds pages which include both 'this' and 'that'
- this or that
- Finds pages which include either 'this' or 'that'
- host:www.jasmine.org.uk
- Finds pages just on our Web server
- domain:org.uk
- Finds pages on all servers in the 'org.uk' domain, that is,
personal and not-for-profit sites in the UK
- image:auchencairn
- Finds pages with pictures of Auchencairn...
- link:www.jasmine.org.uk
- Finds pages which have links to our server
- title:Jasmine
- Finds pages with 'Jasmine' in the title
- auchen*
- Finds pages with words which start with 'auchen'
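To make a few of these operators concrete, here is a rough Python sketch of how a page might be tested against a query. It is not Alta Vista's actual implementation, which is not described here; the function name score and the sample page text are invented for illustration. It covers only "exact phrase", +word, -word, plain words and the trailing * wildcard, and it treats a phrase as a simple substring match.

```python
import re


def score(query, text):
    """Return None if the page is rejected, else the number of terms matched.

    Hypothetical helper covering a small subset of the query syntax.
    """
    lowered = text.lower()
    words = set(re.findall(r"[a-z0-9]+", lowered))
    # Pull out "exact phrase" terms first, then split the rest on whitespace.
    phrases = re.findall(r'"([^"]*)"', query)
    terms = re.sub(r'"[^"]*"', " ", query).split()

    matched = 0
    for phrase in phrases:
        # Simplification: a phrase matches if it appears as a substring.
        if phrase.lower() not in lowered:
            return None
        matched += 1
    for term in terms:
        required = term.startswith("+")
        excluded = term.startswith("-")
        stem = term.lstrip("+-").lower()
        if stem.endswith("*"):
            # Wildcard: any word on the page starting with the prefix.
            found = any(w.startswith(stem[:-1]) for w in words)
        else:
            found = stem in words
        if excluded:
            if found:
                return None      # -word: page must not contain it
        elif found:
            matched += 1         # plain or +word found: counts towards rank
        elif required:
            return None          # +word missing: page is rejected
    # A page that matched nothing at all is not a result.
    return matched if matched > 0 else None


page = "Auchencairn Village Hall hosts an Internet session"
for q in ['"village hall"', "+internet ceilidh", "internet -session", "auchen*"]:
    print(q, "->", score(q, page))
```

A higher score stands in for 'preferring' pages that also contain the optional words; a real engine's ranking would be far more elaborate.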
- Originally
- A business set up to operate a search engine
- Now
- A business set up to operate a search engine, sells
'value-added' search services, slightly business
oriented
- What's good about it
- Categorisation system ('Search Folders')
- Originally
- A business set up to operate a search engine
- Now
- A business set up to operate a search engine, slightly
geek oriented
- What's good about it
-
Fast Search
- Originally
- A business set up to operate a search engine
- Now
- A business set up to operate a search engine, sells
technology and database to other search engines.
- What's good about it
- Not American... based in Norway
- Fast
- Second largest number of pages indexed (this
month)
And the rest...
Indexes
Largely hand-maintained -- very labour intensive.
Consequently there are far fewer good indexes than good search
engines.
- Originally
- A hobby project set up by two students (David Filo and
Jerry Yang)
- Now
- A stand-alone business funded mainly by advertising (it now
charges commercial organisations to be listed)
- What's good about it
-
- Originally
- Volunteer project to produce comprehensive Web index
- Now
- Volunteer project to produce comprehensive Web index
- What's good about it
- Even more comprehensive than Yahoo
- Good index
- You can join in
- Originally
- Yellow pages online
- Now
- Yellow pages online
- What's good about it
- Yellow pages online
- Originally
- Innovative project to produce national index for
Scotland
- Now
- Rather tired national index for Scotland, not very
actively promoted
- What's good about it
- Quite useful for Scottish resources
Searching Usenet
What is Usenet
- Huge, ancient, anarchic distributed discussion
system
- Older than the Internet but now mostly works on the
Internet
- Parts of it very unsafe for the unwary
-
Always worth
- Originally
- A very successful (and profitable) archive of Usenet
- Now
- A loss-making consumers' guide, currently for sale.
- What's good about it
- Comprehensive archive of recent postings to
Usenet
- Good search facilities
(Since this presentation was given, DejaNews has been taken
over by Google and become groups.google.com)
Others
- There used to be a number of alternatives to Deja, but
they all seem to have disappeared or become extra-cost
services
- Usenet produces very large amounts of data, so storing it
is expensive.
- Also, there is a lot of traffic on Usenet which is
extremely dodgy, so systems which archive it face potential
legal problems.
Other guides and tutorials
There are a number of useful guides to searching the Internet
out there. We would recommend
Simon Brooke
Last modified: Wed Feb 13 17:57:21 GMT 2002