How Can I Reduce the Index Size?
DevTech has made every effort to make the index files as compact as possible,
and we continue to develop new ways of storing more information in less
space. You can make the index files smaller still, at the cost of having fewer
features available in the SiteSurfer applet. The most effective strategies
for accomplishing this are:
- Index fewer pages. The size of the index is directly
proportional to the number of pages it references. Only index those pages
that benefit the users of SiteSurfer. Pages can be trimmed manually using
the Index selected items option. SiteSurfer
can also be configured to trim pages automatically by
excluding name patterns and by using
robots meta tags and page link depth limits.
- Don't index numbers. Numbers embedded in text rarely
make good candidates for searching, because they are formatted inconsistently and
users are unlikely to know the precise numbers in a page. If you
disable number characters, numbers will not take up
any space in the index. Be careful with this change, though, if users
are likely to search for things like "OS/2" and "Year 2000".
- Index fewer fields. One of SiteSurfer's most powerful
features is its ability to index and search fields
other than text. This includes page titles, sizes, and last-modified dates.
SiteSurfer also has special handling for meta information, like keywords,
author, and abstract. While each field provides powerful searching abilities,
it also adds to the index size. You should balance the needs of people
who search the index against the size and download time of the index files, and enable
only those fields that provide the most value.
- Make a site map only if you use it. The site map is
a powerful navigation aid, but if you never use it or have it disabled using
the SHOWVIEWS applet tag, then you should consider disabling
the site map option in the Builder. The site map adds
a relatively small amount of data to the index, but it could be
noticeable for very large sites, especially those with many anchor links inside
pages.
- Disable advanced search options.
Proximity and concept
searching are very powerful methods of searching text content. Even if you
do not normally select these kinds of queries explicitly in the applet, SiteSurfer's
advanced ranking algorithm still uses some of their data to enhance the
ranking scheme, so you find the documents you're looking for more quickly. These
options do use additional space in the index, so they may be candidates
for removal if the index is too large.
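
To make the trade-off behind "Don't index numbers" more concrete, here is a small Python sketch of a toy tokenizer. It is not SiteSurfer's actual indexing code. The function name tokenize, the index_digits flag, and the rule that digits simply act as word separators when disabled are all assumptions made for illustration.

```python
import re

def tokenize(text, index_digits=True):
    """Split text into lowercase index terms.

    Toy model only: this is NOT how SiteSurfer tokenizes pages. It assumes
    that when digits are disabled they act like punctuation, so they
    separate words and never become index terms themselves.
    """
    pattern = r"[a-z0-9]+" if index_digits else r"[a-z]+"
    return re.findall(pattern, text.lower())

# Hypothetical page text containing the kinds of terms mentioned above.
page = "Year 2000 fixes for OS/2 shipped in 1999 with 3 patches."

with_digits = tokenize(page, index_digits=True)
without_digits = tokenize(page, index_digits=False)

print(len(with_digits), with_digits)
print(len(without_digits), without_digits)
```

In this toy model, the second list is shorter because every purely numeric term is gone, which is where the space saving comes from. The same change also removes "2000" and reduces "OS/2" to "os", so a search for "Year 2000" or "OS/2" can no longer match on the number.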