James S. Huggins' Refrigerator Door

Search Services Summary


The Summary

This page provides a table summarizing and comparing the features of the free Search Services on my site. Following the table is a description of each table heading and the feature it is intended to describe. In addition, there is a set of footnotes expanding on particular listings.

The Caveat

This page only shows the information for the free offerings of these different services. These services all offer paid options as well and each distinguishes the paid offering from the free offering in different ways. For example, the service may offer only a small number of pages for free and charge for more. Or, it may only offer certain features (e.g., automatic reindexing, or indexing of more file types) with a paid service. Readers are strongly cautioned against using this information to compare paid services.

I am working on the best way to present the paid information as well. In the meantime, caveat lector: reader beware.

The Bias

I am amazed by the number of people who ask me for my "objective opinion". Think about it! "Objective opinion" is clearly an oxymoron.

When you look at an "evaluation", the bias of the evaluator shows. This evaluation is clearly biased towards "free". It is also biased by the size of my site (small), its complexity (simple), my desire for a "consistent" format and my love of control. These biases clearly influence what I look for and how I present it. Keep that in mind as you tour this table.

Summary Table

Table Headings

Table Footnotes

 


My Search Pages

Simple Search Page: The primary search page on the site. Includes the search forms, but not all the discussion. (search)

Extended Search Page: Includes everything that the Simple Search Page has, plus an extended discussion of the features of each of these search engines and some of my experiences in creating this section of my site. (search_extended)

 


 

    

Search Services Summary Table

| Basics | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Results Formatting | Complete [1] | Configuration | Configuration | Wrapper | Configuration | Configuration | Wrapper + [2] |
| Search Options | User | Configuration | Configuration | User [5] | Configuration | User | Configuration |
| Page Limit | 500 | Unlimited [4] | 1500 | 1000 | Unlimited [13] | 5000 [9] | 1000 |
| robots.txt | Yes | Yes | Yes | Yes | Yes | Yes [10] | Yes |
| Exclusion Spec | Yes | No | Yes [31] | No | Yes | Yes | Yes |
| Robots Meta | Yes | No | Yes | Yes | No | No | No |
| Partial Noindex [6] | Yes | Nonstandard [8] | Yes [7] | Yes | No | No | No |
| Ads or Logo | Logo | Ads | Logo | Ads | Ads | Logo | Ads |
| Results Context | Yes | No | Yes | No | Yes | Yes | No |
| Results Language | Yes [19] | English Only | English + | English Only | English + | English Only | English Only |
| Search Language | English + | English Only | English + | English Only | English Only | English Only | English Only |
| File Formats | html, txt, pdf, mp3 tag, flash tag | html (only) | html, txt, mp3 tag, flash tag, midi tag | html (only) | html (only) | html, txt | html, txt, pdf [37] |

| Indexing | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Auto Reindexing | Yes [22] | Yes [17] | No | No | No | Yes [23] | No |
| Manual Limits | None | None | None | None | None | Every 8 hours | None |
| Online/Batch | Both | Batch | Both | Batch | Batch | Batch | Both |
| Indexable Components | Title, Description, Keywords, Body, Alt Text, URL [24] | Unknown, Not Disclosed | Title, Description, Body, Keywords, Alt Text [25] | Unknown, Not Disclosed | Unknown, Not Disclosed | Unknown, Not Disclosed | Title, Description, Keywords, Body |
| Multiple Roots | Yes [21] | No | Yes (3) | No | No | No | Yes |

| Results Format | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Templates | 7 | 1 | 1 | 1 | 1 | 1 | 2 |
| HTML Page Header | Yes [19] | No | No | Yes [33] | Yes [11] | Yes | Yes [35] |
| HTML Page Footer | Yes [19] | No | No | Yes [33] | Yes [12] | Yes | Yes [35] |
| Display Site Name | Yes [19] | No | No | Yes [33] | Yes | No | Yes [35] |
| Link Back Text/URL | Yes [19] | Yes | No | Yes [33] | Yes [14] | Yes [14] | Yes [35] |
| Page Title | Yes [19] | Yes | No | Yes [33] | No | Yes [14] | Yes [35] |
| Page Heading | Yes [19] | Yes | No | Yes [33] | Yes [14] | Yes [14] | Yes [35] |
| Image/Logo | Yes [20] | No | Yes | Yes | Yes [14] | Yes [14] | Yes |
| Selectable Service Logo | Yes [20] | No | Yes | No | No | No | No |

| Presentation | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Sort: Configure | Yes | No | No | No | Yes | No | No |
| Sort: User Control | Yes | No | No | Yes | Yes | No | No |
| Sort | Score, Update | Score | Score | Score, Update | Score, Update, Title | Score | Score |
| Per Page: Config | Yes | No | No | Yes | Yes | No (10) | Yes |
| Per Page: User | Yes | No | No | No | Yes | No (10) | No |
| Per Page Options | 5, 10, 25, 50, 100 | 10 | 10 | 5, 10, 25, 50 | 5, 10, 15, 20, 25 | 10 | any value |
| Short Format | Yes | No | No | Yes | Yes | No | Yes |
| Short Format: User | Yes | No | No | No | Yes | No | No |
| Same or New Page | Configurable [19] | Same | Configurable | Same | Configurable | Same | Same |

| Fonts | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Configure Face | Yes [19] | No | Yes | Yes | Yes | No | Yes |
| Configure Color | Yes [19] | No | Yes | Yes | Yes | No | Yes |
| Component Fonts | Yes [19] | No | No | No | No | No | No |
| Config Link Colors | Yes [20] | Yes | Yes | Yes | No | Yes [14] | Yes |
| Config Context | Yes | No | No | No | No | No | No |

| Page | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Background Color | Yes [19] | Yes | Yes | Yes | Yes | Yes [14] | Yes |
| Background Image | Yes [19] | Yes | Yes | Yes | Yes | Yes [14] | Yes |

| Long Format | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Title | Yes [19] | Yes | Yes | Yes | Yes | Yes | Yes |
| Description | Configurable [19] | Yes | Configurable | Configurable | No | No | Yes [36] |
| Context | Yes [19] | No | Configurable | No | Yes | Yes [16] | Yes [36] |
| Score | Yes [19] | No | No | Configurable | Icon | Number + Icon | No |
| Date | Yes [19] | No | No | No | No | No | Yes |
| Size | Yes [19] | No | No | Configurable | No | No | Yes |
| URL | Yes [19] | Yes | Configurable | Configurable | Yes | Yes | Yes |
| Depth | No | No | No | No | No | Yes | No |

| Special Links | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Show Similar | No | No | No | No | No | Yes | No |
| Link to Parents | No | No | No | No | No | Yes | No |
| Help Link | Yes [19] | No | Yes | No | No | Yes [15] | No |

| Searching | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Match Options | Any, All, Phrase | Minimal | None | None | All, Any, Boolean | Many | None |
| Sound Alike | Yes | No | No | No | No | No | Yes [34] |
| Search In Results | No | Yes | No | No | No | No | No |
| Search Specific Components | Yes | No | No | No | No | No | No |

| Scoring | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Control Weighting | Yes | No | No | No | No | User [26] | Yes |

| Categories | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Site Category | Yes | Yes | No | No | No | No | No |

| Site Map | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Site Map | No | Yes [30] | No | No | No | No | No |
| Site Map Formats | N/A | Table, Outline, List | N/A | N/A | N/A | N/A | N/A |
| User Can Switch Site Map Formats | N/A | Yes | N/A | N/A | N/A | N/A | N/A |
| <title> for Site Map | N/A | Yes | N/A | N/A | N/A | N/A | N/A |
| Headline for Site Map | N/A | Yes | N/A | N/A | N/A | N/A | N/A |
| Configure Site Map Separator | N/A | Yes | N/A | N/A | N/A | N/A | N/A |
| Site Map Depth | N/A | Yes | N/A | N/A | N/A | N/A | N/A |
| Site Map Table Width | N/A | Yes | N/A | N/A | N/A | N/A | N/A |
| Search Box on Site Map | N/A | Optional | N/A | N/A | N/A | N/A | N/A |

| Web Search | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Web Search | No | Yes | No | No | No | No | Yes |

| Advertisements | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Ad Source | N/A | Flycast & Engage | N/A | Not Disclosed | Not Disclosed | N/A | Not Disclosed |
| Ad Privacy Info | N/A | Yes | N/A | Not Disclosed | Not Disclosed [18] | N/A | Not Disclosed |
| Ad Data Collection Opt Out | N/A | Yes | N/A | No | No | N/A | No |

| Character Sets | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Character Set Encoding Support | Yes [27] | No | Yes [28] | No | No | No | Yes [38] |
| Double Byte Support | No | No | Yes | No | No | No | No |

| Special | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Excluded Words | Yes | No | Yes | No | No | No | Yes |
| Synonyms | Yes | No | No | No | No | No | Yes |
| Site Subset | Yes [29] | No | No | No | No | No | Yes [39] |
| Frame Support | Yes | Yes | Yes | No | No | No | Yes |
| Password Support | Yes | No | Yes | No | No | No | Yes |

| Administration | Atomz | FreeFind | PicoSearch | Searchbutton | SiteMiner | Thunderstone | whatUseek |
|---|---|---|---|---|---|---|---|
| Usage Reports | Yes | Yes | Yes | Yes | No | No | Yes |
| Indexing Log | Yes | No | Yes | No | No | Yes | No |
| Indexing Error Log | Yes | No | No [32] | No | No | Yes | No |
 

    

 

    

Headings

Basics

Results Formatting: The ability to control the formatting of the search results. Options are Configuration, Wrapper and Complete.

Search Options: The ability to control search options such as maximum number to return, what parts of the page to search, how many results on a page, whether to display descriptions and/or context. Options are Configuration and User.

Page Limit: Maximum number of pages of the site that may be indexed.

robots.txt: Whether the robot (indexer) honors the standard robots.txt to exclude pages from indexing.
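
As a rough illustration of what "honoring robots.txt" means for an indexer, here is a minimal Python sketch using the standard library's robots.txt parser. The robots.txt content, the URLs and the "ExampleIndexer" user agent name are all made up for the example; none of the services above is known to work this way internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the excluded path is invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def may_index(url: str, user_agent: str = "ExampleIndexer") -> bool:
    """Return True if the robots.txt rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

print(may_index("http://www.example.com/index.html"))
print(may_index("http://www.example.com/private/notes.html"))
```

A robot that honors the file would skip any URL for which the check returns False.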

Exclusion Spec: Whether the service provides the option to exclude pages, similar to a robots.txt file, but configured through the service administration pages.

Robots Meta: Whether the service supports the Robots Meta tag. This tag can specify index/noindex and follow/nofollow to control whether a particular page will be indexed and whether its links will be followed.
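
To make the Robots Meta tag concrete, the following Python sketch reads the directives out of a page's meta tag. It is only an assumption about how an indexer might do it; the class and function names are hypothetical and the sample page is invented.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())

def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

# Invented sample page: do not index it, but do follow its links.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
found = robots_directives(page)
print("index this page:", "noindex" not in found)
print("follow its links:", "nofollow" not in found)
```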

Partial Noindex: Whether the robot (indexer) honors the pseudo-standard <noindex></noindex> to exclude parts of pages from indexing. Yes indicates that the service honors <noindex></noindex>. If the service uses another protocol, it is indicated by "Nonstandard" and a footnote.
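
One way an indexer could honor the pseudo-standard tags is simply to delete everything between them before indexing. The Python sketch below assumes exactly that approach; the function name and the sample markup are hypothetical, and the proprietary markers mentioned in the footnotes would need their own patterns.

```python
import re

# Matches a <noindex>...</noindex> block, across lines and in any letter case.
NOINDEX_BLOCK = re.compile(r"<noindex>.*?</noindex>", re.IGNORECASE | re.DOTALL)

def strip_noindex(html: str) -> str:
    """Remove the parts of a page that are marked as not to be indexed."""
    return NOINDEX_BLOCK.sub("", html)

sample = (
    "<p>Indexed text.</p>"
    "<noindex><p>Navigation menu.</p></noindex>"
    "<p>More indexed text.</p>"
)
print(strip_noindex(sample))
```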

Ads or Logo: Whether the service uses ads to pay for the free service or relies on its logo to achieve branding.

Results Context: Whether the service has the ability to display the "context" of the found word. Context shows the part of the page containing the word being searched.
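
A "context" display can be approximated by cutting a window of text around the first match. This Python sketch is an assumption about the general idea, not a description of any service; the function name, the window width and the sample sentence are invented.

```python
def context_snippet(text: str, word: str, width: int = 30) -> str:
    """Return the text around the first case-insensitive match of word, or ""."""
    pos = text.lower().find(word.lower())
    if pos == -1:
        return ""
    start = max(0, pos - width)
    end = min(len(text), pos + len(word) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix

body = "This page collects links and resources about Multiple Sclerosis research and support groups."
print(context_snippet(body, "sclerosis"))
```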

Results Language: Different languages supported for display on the results page. Primarily used to change the language used to display headings (e.g., "prev", "next").

Search Language: Different languages supported for searching, as distinct from the language used to display the results page.

File Formats: The particular file formats that are indexed.

Indexing

Automatic Reindexing: Whether the service automatically reindexes the site and, if so, how often.

Manual Limits: How frequently a manual reindex can be requested.

Online/Batch: Some services perform the indexing in a "batch" or "background" mode and email you when completed. Some services also perform the indexing "immediately" in an "online" mode and provide a page to monitor the reindexing progress. So far, every "online" also does "batch"; for these I show "both".

Indexable Components: There are various components that can be indexed on a web (HTML) page. These include (a) Title, (b) Description, (c) Keywords, (d) Body, (e) Alt tags and (f) URL. Some services provide explicit information on what they index and provide the ability to select which to index and/or what "weight" to give the different components. In addition, some indexes permit querying for words within these "components" separately (e.g., looking for "Sclerosis" in the title of a page).

Multiple Roots: Some services restrict the indexing for a site to pages at one and only one root (e.g., pages at www.JamesSHuggins.com). Some sites have pages under more than one root (or entry point), and some services permit specifying multiple roots as a way to index these into one logical index.

Other Indexing Controls: At least one service also provides other ways to control indexing (e.g., pages can be restricted to one server).

Results Format

Templates: Services that offer "configuration" as one of their approaches to creating results pages may offer one or more "templates" as the basis for configuration.

HTML Page Header: Some services permit the specification of HTML for the page header of the results page. This is more than just specifying a logo, or title. The ability to include full HTML permits a high degree of customization of the results page, approaching "Wrapper".

HTML Page Footer: Some services permit the specification of HTML for the page footer of the results page. This is more than just specifying a logo, or link back text. The ability to include full HTML permits a high degree of customization of the results page, approaching "Wrapper".

Display Site Name: Services that do not permit HTML Headers may permit display of the site name.

Link Back Text/URL: Services that do not permit HTML Headers may provide a way to specify "link back text" and the URL to link back to.

Page Title: Services may permit customization or specification of the Title of the results page.

Page Heading: Services that do not permit HTML Headers may permit specification of page header text.

Image/Logo: Services that do not permit HTML Headers may permit specification of a personal image or logo to appear at the top of the page.

Selectable Service Logo: Services that use their logo as part of their branding effort may permit the specification of which logo to use, in order to more closely match the look and feel of your page.

Presentation

Sort: Configure: Whether the service permits configuration of results sort order. Results may be sorted into three different orders: (a) score (a measure of the relevancy of the page to the request), (b) Update date (the date of change of the page), (c) Title of the page. Some services sort into only one order (typically score). Others permit configuration of the sequence.
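
The three sort orders can be sketched in a few lines of Python. The Result type, the field names and the sample data are hypothetical; the sketch assumes that higher scores and more recent updates should come first.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Result:
    title: str
    score: float      # relevancy of the page to the request
    updated: date     # date the page was last changed

def sort_results(results, order="score"):
    """Sort results by score, by update date (newest first), or by title."""
    if order == "score":
        return sorted(results, key=lambda r: r.score, reverse=True)
    if order == "update":
        return sorted(results, key=lambda r: r.updated, reverse=True)
    if order == "title":
        return sorted(results, key=lambda r: r.title.lower())
    raise ValueError(f"unknown sort order: {order}")

# Invented sample results.
hits = [
    Result("Search Services Summary", 0.62, date(2000, 8, 16)),
    Result("Extended Search Page", 0.91, date(2000, 7, 1)),
    Result("About This Site", 0.40, date(2000, 9, 3)),
]
for order in ("score", "update", "title"):
    print(order, [r.title for r in sort_results(hits, order)])
```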

Sort: User Control: Whether the service permits the user to specify the sort sequence at request time.

Sort: The possibilities for sorting the results.

Per Page: Config: Whether the service permits configuration of the number of results per page.

Per Page: User: Whether the service permits user specification of the number of results per page.

Per Page Options: The possible values for the number of results per page.

Short Format: Whether the service supports display of a "Short Format". The Long Format of a display typically includes the title, either the description or context or both, and perhaps URL, size, update date and score. The Short Format of a display typically includes only the title. Some services only use a Long Format. Some support both.

Short Format: User: Whether the service permits the user to specify a Short Format display at search time.

Same or New Page: Whether clicking on a result link will open the result in the same page (the default target) or in a new page.

Fonts

Configure Face: Whether the font face used for the results listing can be configured.

Configure Color: Whether the font color used for the results listing can be configured.

Component Fonts: Whether the fonts specific to the (a) Title, (b) Description, (c) Context, (d) URL, (e) Size, (f) Update Date, and (g) Score, can be individually configured. If "No", then these various components of the listing will have the same font face and color, and possibly size.

Config Link Colors: Whether the colors used for links (Link, ALink and VLink) can be configured for the results page.

Config Context: Whether the highlighting used to identify words in context can be configured. (Typically this is a bold listing.)

Page

Background Color: Whether the background color of the results page can be configured.

Background Image: Whether a background image for the results page can be configured.

Long Format

Title: Whether the Long Format includes the page Title.

Description: Whether the Long Format includes the page Description.

Context: Whether the Long Format includes the context (i.e., the text containing the searched words).

Score: Whether the Long Format includes the relevancy score.

Date: Whether the Long Format includes the Update Date.

Size: Whether the Long Format includes the page size.

URL: Whether the Long Format includes the page URL.

Depth: Whether the Long Format includes the Depth. (Depth is the number of clicks from the home page that this page is. A page with a Depth of "1" is one click off the home page. A page with a Depth of "2" is two clicks.)
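
Depth as defined here is a shortest click distance, which a breadth-first walk of the link graph computes directly. The Python sketch below assumes the links have already been collected into a dictionary; the function name and the page paths are invented for illustration.

```python
from collections import deque

def page_depths(links: dict, home: str) -> dict:
    """Return the minimum number of clicks from the home page to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:          # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site: each page maps to the pages it links to.
site = {
    "/": ["/search.html", "/about.html"],
    "/search.html": ["/search_extended.html"],
    "/about.html": ["/"],
}
print(page_depths(site, "/"))
```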

Special Links

Show Similar: Whether the results page includes a link to "Show Similar" pages (or an equivalent link).

Link to Parents: Whether the results page includes a link to the parents of the page.

Help Link: Whether the results page includes a link to "Help". (Typical help would include information on search options, and, in the case of complex button/selection forms, information on these options.)

Searching

Match Options: Whether the search function supports options for matching (e.g., any word, all words, exact phrase).
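
The three example match options differ only in how the query words are tested against the page. This Python sketch assumes simple whitespace-separated words and ignores punctuation and stemming; the function name and sample text are hypothetical.

```python
def matches(text: str, query: str, mode: str = "any") -> bool:
    """Test a page's text against a query using any-word, all-words or exact-phrase matching."""
    words = text.lower().split()
    terms = query.lower().split()
    if not terms:
        return False
    if mode == "any":
        return any(term in words for term in terms)
    if mode == "all":
        return all(term in words for term in terms)
    if mode == "phrase":
        n = len(terms)
        # Slide a window of n words across the text and compare it to the query.
        return any(words[i:i + n] == terms for i in range(len(words) - n + 1))
    raise ValueError(f"unknown match mode: {mode}")

text = "multiple sclerosis research news"
print(matches(text, "sclerosis cure", "any"))
print(matches(text, "sclerosis cure", "all"))
print(matches(text, "multiple sclerosis", "phrase"))
```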

Sound Alike: Whether the search function supports searches for similar sounding words.
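
One classic way to match similar-sounding words is Soundex, which reduces a word to a letter and three digits so that words that sound alike share a code. The Python sketch below implements the common American Soundex rules; it is offered as an example of the technique and not as the method any particular service uses.

```python
# Letters grouped by the Soundex digit they map to; vowels, H, W and Y have no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word: str) -> str:
    """Return the four-character Soundex code for a word, or "" if it has no letters."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":      # H and W do not separate letters with the same digit
            prev = digit
    return (code + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))
```

Two words would then be treated as "sounding alike" when their codes are equal.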

Search In Results: Whether the search function supports searching only within the results of the last search. This is useful for narrowing a search.

Search Specific Components: Whether the search function supports searching for words in specific components of the page. For example, searching within (a) Title, (b) Description, (c) Keywords, (d) Body, (e) Alt tags, (f) URL.

Scoring

Control Weighting: Whether the relevancy score weighting (e.g., relevancy of Title vs Alt tags) can be controlled. "Yes" indicates it is configurable. "User" indicates that the user can alter the weightings at search time.
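
To show what controlling the weighting might look like, here is a small Python sketch that scores a page by counting matches in each component and multiplying by a per-component weight. The weights, component names and scoring formula are all assumptions made for the example; the services do not disclose their actual formulas.

```python
# Hypothetical default weights: a match in the title counts more than one in the body.
DEFAULT_WEIGHTS = {"title": 5.0, "description": 3.0, "keywords": 2.0, "body": 1.0, "alt": 0.5}

def score_page(components: dict, term: str, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Sum the weighted number of whole-word matches of term across the page components."""
    term = term.lower()
    total = 0.0
    for name, text in components.items():
        hits = text.lower().split().count(term)
        total += weights.get(name, 0.0) * hits
    return total

page = {
    "title": "Multiple Sclerosis Links",
    "body": "Links about sclerosis and sclerosis research",
}
print(score_page(page, "sclerosis"))                               # configured weights
print(score_page(page, "sclerosis", {"title": 1.0, "body": 1.0}))  # user-supplied weights
```

Passing a different weights dictionary at call time corresponds to the "User" case, where the weighting is chosen at search time rather than configured in advance.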

Categories

Site Category: Whether the service requests a "classification" or "category" of the website.

Site Map

Site Map: Whether the service creates a Site Map. (N. B.: At this time, only one service is known to create a Site Map. This section of the comparison may be eliminated or substantially reduced in scope.)

Site Map Formats: What different Site Map formats are available.

User Can Switch Site Map Formats: Whether the user can choose which Site Map format to see.

<title> for Site Map: Whether a page Title can be specified for the Site Map page.

Headline for Site Map: Whether a page headline can be specified for the Site Map page.

Configure Site Map Separator: Whether the separators used on the Site Map page can be configured.

Site Map Depth: Whether the Site Map depth can be configured.

Site Map Table Width: Whether the Site Map table width can be configured.

Search Box on Site Map: Whether the appearance of a Search Box on the Site Map page can be configured.

Web Search

Web Search: Whether the service also supports searching the web, or only searching the site. (N. B.: In creating the examples on this site, all web search options have been disabled and are not used.)

Advertisements

Ad Source: The source(s) for the ads appearing on the results pages. May include links to the sites involved.

Ad Privacy Info: Whether privacy information for the ads is disclosed. May include links to the sites involved.

Ad Data Collection Opt Out: Whether the sources for the ads provide the option to opt out of any data collection used to associate ads with other behavior.

Character Sets

Character Set Encoding Support: Whether the service supports recognition of different character sets. (N. B.: This is only of substantial interest if you use an alternate character set.)

Double Byte Support: Whether the service supports recognition of double-byte (e.g., Chinese) character sets. (N. B.: This is only of substantial interest if you use a double-byte character set.)

Special Features

Excluded Words: Whether the service permits you to specify words not to index.

Synonyms: Whether the service permits you to specify a synonym list. For example, if you specify MS and "Multiple Sclerosis" to be synonyms, then searches for one will also find the other.
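
A synonym list can be thought of as groups of interchangeable terms, with each query expanded to its whole group before searching. The Python sketch below assumes that model; the data structure and function name are hypothetical.

```python
# Each set is one group of terms that should be treated as equivalent.
SYNONYMS = [
    {"ms", "multiple sclerosis"},
]

def expand_query(query: str) -> set:
    """Return the query plus every term listed as its synonym."""
    q = query.lower().strip()
    expanded = {q}
    for group in SYNONYMS:
        if q in group:
            expanded |= group
    return expanded

print(expand_query("MS"))
print(expand_query("baseball"))
```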

Site Subset: Whether the service permits you to create "subsets" or sections of the site for searching. For example, if you have a site on Baseball, you might create subsets or sections on Players, Teams, Leagues. And you might wish to permit a user to specify a search of a particular subset or section. Typically this requires segregating the subsets or sections into their own subdirectories. (N. B.: In creating the examples on this site, no subsets were created and all subset features of the services were "turned off".)
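
Because subsets usually correspond to subdirectories, restricting a search to one subset can be as simple as filtering by URL path prefix. This Python sketch follows the Baseball example; the subset names, directory paths and URLs are invented.

```python
from urllib.parse import urlparse

# Hypothetical subsets, each tied to its own subdirectory.
SUBSETS = {"players": "/players/", "teams": "/teams/", "leagues": "/leagues/"}

def in_subset(url: str, subset: str) -> bool:
    """Return True if the URL's path falls under the subset's subdirectory."""
    return urlparse(url).path.startswith(SUBSETS[subset])

def search_subset(urls, subset):
    """Keep only the pages that belong to the requested subset."""
    return [u for u in urls if in_subset(u, subset)]

pages = [
    "http://www.example.com/players/ruth.html",
    "http://www.example.com/teams/yankees.html",
    "http://www.example.com/index.html",
]
print(search_subset(pages, "players"))
```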

Frame Support: Whether the service can successfully process sites using frames.

Password Support: Whether the service can store passwords required to access sections of your site in order to successfully index those sections.

Administration

Usage Reports: Whether the service offers reports to show usage of the search service.

Indexing Log: Whether the service provides a log of the indexing performed on your site. Such a log permits you to see what indexes were created and to evaluate whether the indexing is working as you anticipated.

Indexing Error Log: Whether the service provides an error log of the indexing performed on your site. Such a log permits you to see what indexes were not created (e.g., missing/broken links, documents not indexed because they are the wrong type, etc.) and to evaluate whether the indexing is working as you anticipated.

    

 

    

Footnotes

1. Atomz has the most complete scripting language of any so far.

2. whatUseek provides more than just a wrapper. It includes some scripting. But the scripting is not as complete as Atomz.

3. No longer applicable.

4. FreeFind sets an initial limit of 32MB. However, they claim it will be increased as needed for legitimate sites.

5. The Searchbutton User Controlled Search Options are not yet working on my site, but they are being helpful and I'm sure we'll get it worked out.

6. No longer applicable.

7. Proprietary implementation of Partial Noindex: <nosearchstart><nosearchstop>.

8. Proprietary implementation of Partial Noindex: <!-- FreeFind Begin No Index --> <!-- FreeFind End No Index -->

9. Thunderstone limits each individual page to 100,000 bytes. Several pages on my site are larger and will not fully index using this service.

10. This option can be turned off.

11. The header appears "beneath" or "after" the header provided by the service.

12. The footer appears "above" or "before" the footer provided by the service.

13. May require increasing of intermediate limits by the support staff.

14. The Page Header HTML can specify this. It is not specified "separately".

15. The help links appear if the user clicks "Options" or if you include them on your page.

16. In addition to the results page showing some context, the link to "Match Info" shows the entire page with all matching words highlighted.

17. Reindexing options include (a) Monthly, (b) Every Two Weeks, (c) Every Week (on a specified day) and (d) Every day "if possible". Also permits specification of reindexing time.

18. I have been unable to identify any specifics about SiteMiner's data collection for ads. However, their Tracking Page promoting their advertising promises the prospective advertiser "Details about people who clicked on your banner".

19. I have shown this value for Atomz because you have absolute control of these characteristics through its complete scripting language. Although this characteristic is not "set" through a menu, it can be configured by editing the resulting template page or by constructing a totally custom page.

20. If you use one of Atomz's "templates", and choose to "configure" rather than exercise "complete control" over the results page, this is one of the options available. In addition, if you exercise "complete control" you have absolute control of these characteristics through its complete scripting language.

21. In addition to permitting each account to have multiple (apparently unlimited) URLs, Atomz also permits you to use one login to manage multiple accounts.

22. The configuration permits choosing a day of the week and a time of the day. It also permits turning off automatic indexing.

23. Every two weeks. Does not appear to be configurable or optional.

24. The relative weight assigned to each component is configurable.

25. Each component is individually configurable to index, or not; however, the relative "weight" assigned is unknown.

26. The Thunderstone engine permits the user to control component weighting at search time. A default is not configurable.

27. The specific character set is configurable.

28. The specific character set is recognized by evaluating the Meta Tags.

29. Called "collections".

30. At this time, only this one service offers a Site Map. This preliminary information may be eliminated or substantially reduced.

31. Provides the ability to restrict indexing by Domain, Directory and Server.

32. While PicoSearch does not seem to have an explicit error log, the online indexing feedback does indicate indexing errors.

33. I have shown this value for Searchbutton because you have control of these characteristics through specification of your own Wrapper. Although this characteristic is not "set" through a menu, it can be configured through the editing of your Wrapper.

34. whatUseek provides soundex. It is configurable at index time and not selectable at search time.

35. I have shown this value for whatUseek because you have control of these characteristics through specification of your own Wrapper. Although this characteristic is not "set" through a menu, it can be configured through the editing of your Wrapper.

36. The Long Format for whatUseek can contain the Description or the Context, but not both.

37. whatUseek notes in a private email that their spider looks at the actual, internal, file format and not the extension.

38. whatUseek notes in a private email that they do support Character Sets "for all Latin-based languages", but they have not provided specifics.

39. Called "slices".

    

 



 


NOTICE: SITE CURRENTLY UNDERGOING EXTENSIVE REWRITE
This particular page HAS NOT yet been rewritten.
It is part of the site that is still in progress.


Copyright 1997-2014 James S. Huggins. All rights perversed.
Original content licensed under a Creative Commons License.

This page created before: Wed, 16.Aug.2000

Last updated: 21:02, Thu, 01.May.2014
