
A Comparison of Search Engines For Finding Resources

By Yuanlei Zhang, April 28, 2004


1. Introduction
2. Different Types of Search Engines
3. Major Components of Crawler-based Search Engines
4. Comparing Search Engines: Crawler Comparisons
5. Comparing Search Engines: Index Comparisons
6. Comparing Search Engines: Search Commands Comparisons
7. Comparing Search Engines: Search Results Comparisons
8. Summaries
9. References



2. Different Types of Search Engines

People often use the term "search engine" generically to describe both crawler-based search engines and human-powered directories. In fact, these two types of search engines gather their listings in radically different ways and are therefore inherently different.

Crawler-based search engines, such as Google, AllTheWeb and AltaVista, create their listings automatically: a piece of software "crawls" or "spiders" the web, and the engine indexes what it finds to build the search base. Because the crawler revisits pages, it picks up changes to them, and those changes affect how the pages are listed in the search results.
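The crawl-then-index cycle described above can be illustrated with a minimal sketch. It makes several assumptions that are not from this paper: the "web" is a small in-memory dictionary (`TOY_WEB`, with made-up addresses) rather than real pages fetched over the network, and the names `crawl`, `build_index` and `search` are hypothetical. Real crawlers and indexes are far more elaborate.

```python
from collections import defaultdict, deque
import re

# Hypothetical in-memory "web": address -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("search engines crawl the web", ["b.example", "c.example"]),
    "b.example": ("a crawler follows links", ["a.example"]),
    "c.example": ("an index maps words to pages", []),
}


def crawl(seed, web):
    """Follow links breadth-first from seed; return {address: text}."""
    seen = {seed}
    queue = deque([seed])
    fetched = {}
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to fetch
        text, links = web[url]
        fetched[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


def build_index(pages):
    """Build an inverted index: word -> set of addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the addresses that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

For example, `search(build_index(crawl("a.example", TOY_WEB)), "crawler")` finds only the page whose text contains that word. If a page's text in `TOY_WEB` is edited and the crawl and index steps are run again, the search results change accordingly, which is the behavior that distinguishes this type of engine from a directory.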

Crawler-based search engines work well when you have a specific search topic in mind, and they can be very efficient at finding relevant information in that situation. However, when the search topic is general, crawler-based search engines may return hundreds of thousands of irrelevant responses to simple search requests, including lengthy documents in which your keyword appears only once.

Human-powered directories, such as the Yahoo directory, Open Directory and LookSmart, depend on human editors to create their listings. Typically, webmasters submit a short description of their websites to the directory, or editors write one for the sites they review, and these manually edited descriptions form the search base. Therefore, changes made to individual web pages have no effect on how these pages are listed in the search results.
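As a contrast with crawling, here is a minimal sketch of a directory whose search base is the edited descriptions rather than the pages themselves. Everything in it is assumed for illustration: the `Listing` fields, the sample entries, the category names, and the helper functions `directory_search` and `browse` do not describe how any real directory is implemented.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    url: str
    category: str
    description: str  # written by an editor or submitted by the webmaster


# Hypothetical directory entries.
DIRECTORY = [
    Listing("hiking.example", "Recreation/Outdoors",
            "Trail guides and maps for hikers"),
    Listing("recipes.example", "Home/Cooking",
            "Family recipes and cooking tips"),
]

# Hypothetical live page content. The directory never reads it.
LIVE_PAGES = {
    "hiking.example": "Welcome to our hiking site",
    "recipes.example": "Welcome to our recipe site",
}


def directory_search(listings, query):
    """Match query words against the edited descriptions only."""
    words = query.lower().split()
    return [
        entry.url
        for entry in listings
        if all(w in entry.description.lower().split() for w in words)
    ]


def browse(listings, category_prefix):
    """List the sites filed under a category, as a user drilling down would."""
    return [e.url for e in listings if e.category.startswith(category_prefix)]
```

Because `directory_search` consults only `DIRECTORY`, editing an entry in `LIVE_PAGES` changes nothing in the results; a listing changes only when its description is rewritten.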

Human-powered directories work well when you are interested in a general search topic. In this situation, a directory can guide you, help you narrow your search and give you refined results. As a result, search results found in a human-powered directory are usually more relevant to the search topic and more accurate. However, a directory is not an efficient way to find information when you have a specific search topic in mind.

Table 1 summarizes the different types of the major search engines.

Search Engine      Type
Google             Crawler-based search engine
AllTheWeb          Crawler-based search engine
Teoma              Crawler-based search engine
Inktomi            Crawler-based search engine
AltaVista          Crawler-based search engine
LookSmart          Human-powered directory
Open Directory     Human-powered directory
Yahoo              Human-powered directory; also provides crawler-based search results powered by Google
MSN Search         Human-powered directory powered by LookSmart; also provides crawler-based search results powered by Inktomi
AOL Search         Provides crawler-based search results powered by Google
AskJeeves          Provides crawler-based search results powered by Teoma
HotBot             "4-in-1" search engine; provides crawler-based search results powered by AllTheWeb, Google, Inktomi and Teoma
Lycos              Provides crawler-based search results powered by AllTheWeb
Netscape Search    Provides crawler-based search results powered by Google

Table 1: Different types of the major search engines

The table above shows that some search engines, such as Yahoo and MSN Search, provide both crawler-based results and human-powered listings, which makes them hybrid search engines. A hybrid search engine still favors one type of listing over the other for its main results.

There is another type of search engine: the meta-search engine.

Meta-search engines, such as Dogpile, Mamma, and Metacrawler, transmit user-supplied keywords simultaneously to several individual search engines, which actually carry out the search. The meta-search engine can then integrate the results returned by all of these engines, eliminate duplicates, and add features such as clustering the results by subject.
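The fan-out and merge steps can be sketched as follows. This is only an illustration under stated assumptions: `engine_a` and `engine_b` are hypothetical stand-ins that return canned result lists instead of querying real search engines, the merge simply interleaves the lists rank by rank, and duplicates are detected with a crude rule (`normalize`) that ignores a trailing slash. Clustering by subject is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest


# Hypothetical back-end engines returning ranked result lists.
def engine_a(query):
    return ["http://x.example/", "http://y.example/page"]


def engine_b(query):
    return ["http://y.example/page/", "http://z.example"]


def normalize(url):
    """Crude duplicate key: the address without a trailing slash."""
    return url.rstrip("/")


def meta_search(query, engines):
    """Send query to all engines at once, then merge and de-duplicate."""
    if not engines:
        return []
    # Query every engine concurrently; map() keeps the engines' order.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    merged, seen = [], set()
    # Interleave: every engine's first hit, then every second hit, and so on.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is None:
                continue  # this engine returned fewer results
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged
```

Calling `meta_search("any keywords", [engine_a, engine_b])` returns three addresses, because the page reported by both engines is kept only once. Rank-by-rank interleaving is just one simple merge policy chosen for the sketch; other ways of combining rankings are possible.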

Meta-search engines save time: you search in only one place and are spared the need to learn and use several separate search engines. "But since meta-search engines do not allow for input of many search variables, their best use is to find hits on obscure items or to see if something can be found using the Internet." [5]



This article is the term paper for IS567 - Information Network Applications, taught by Dr. Gretchen Whitney at the School of Information Science at the University of Tennessee, Knoxville. Copyright © 2004, All Rights Reserved.