Verity Ultraseek Search Engine  

A search engine is a program that searches a dataset. On the World Wide Web, search engines most often search databases of HTML documents gathered by robots, also called spiders.

The Verity Ultraseek server is the state search engine. A spider associated with this server crawls all state agency Web sites overnight, during off-peak times. In addition, customized interfaces, such as the Bridges search interface to Minnesota Environmental Agencies, allow searching of portions of the state's Web sites.


You can add a search capability directly from your Web page.

The full text of information on state agency Web servers is spidered and deposited in the search engine database. When queried, the search engine searches this database for items matching your query. Information stored within discrete Dublin Core elements helps match search queries with corresponding information in the resource. The full text of the resource is also searched, and results are displayed in order of match quality. Quality is determined by how often words in the resource match the search query and by matches in the document's title and URL.
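
As a rough illustration of ranking by match quality, the following Python sketch counts query words in a document's title, URL, and body and weights each field. The field weights, function names, and sample documents are assumptions made for this example; the actual Ultraseek ranking formula is not described here.

```python
import re

# Hypothetical weights: the real Ultraseek weighting is not published on this page.
FIELD_WEIGHTS = {"title": 3.0, "url": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(document, query):
    """Sum the weighted number of query-word occurrences in each field."""
    terms = set(tokenize(query))
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(document.get(field, ""))
        total += weight * sum(1 for word in words if word in terms)
    return total


def rank(documents, query):
    """Return the documents that match at all, best match first."""
    scored = [(score(doc, query), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for value, doc in scored if value > 0]


# Illustrative documents only.
DOCS = [
    {"title": "Budget", "url": "http://www.agencyname.state.mn.us/budget", "body": "water"},
    {"title": "Water quality report", "url": "http://www.agencyname.state.mn.us/water", "body": "water water lakes"},
]
```

Calling `rank(DOCS, "water")` places the water quality report first, because the query word appears in its title and URL as well as repeatedly in its body.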

The Ultraseek spider searches the public areas of Minnesota State Government Web sites. Only parts of a Web site that are connected to the root URL will be searched; the spider will not crawl through the network. A root URL looks like www.agencyname.state.mn.us. To exclude parts of a site, such as test or administrative areas, use a robots.txt file.
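
To show how a robots.txt file keeps a spider out of test or administrative areas, the sketch below parses a sample file with Python's standard `urllib.robotparser` module and checks two URLs against it. The directory names `/test/` and `/admin/` and the file names are hypothetical, and the rules are addressed to all robots (`*`) because the Ultraseek spider's user-agent name is not given here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the directory names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
Disallow: /admin/
"""

# Root URL in the form described above.
ROOT = "http://www.agencyname.state.mn.us"


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A public page is allowed; a page under /test/ is excluded.
public_allowed = parser.can_fetch("*", ROOT + "/index.html")
test_allowed = parser.can_fetch("*", ROOT + "/test/draft.html")
print(public_allowed, test_allowed)
```

The robots.txt file itself is placed at the root of the site, for example at www.agencyname.state.mn.us/robots.txt.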

Currently, the Ultraseek robot does not spider Web pages generated from databases, network files, or Common Gateway Interface (CGI) scripts. Non-Web networks are also not spidered for HTML pages. On intranets, the spider cannot go beyond password protection.

What if I add new resources and I want the spider to include them?

Use the ADD URL feature. 
This function allows users (usually Web developers) to add a URL to the collection for indexing. The URL must match a set of patterns already in place for a given collection -- Environmental Information or State of Minnesota. For example, in the State of Minnesota collection, a URL must contain the string "state.mn.us". There are exceptions, however, such as .org URLs, which are handled separately. The advantage of using ADD URL is that URLs of new pages, or of pages with major changes, can be submitted for almost immediate processing.
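
The pattern check described above can be sketched in Python as a simple substring test. Only the "state.mn.us" pattern comes from this page; the dictionary layout and the function name `url_matches_collection` are assumptions for illustration, and the separately handled exceptions (such as .org URLs) are not modeled.

```python
# Only the "state.mn.us" pattern is documented; other collections' patterns are not listed here.
COLLECTION_PATTERNS = {
    "State of Minnesota": ["state.mn.us"],
}


def url_matches_collection(collection, url):
    """Return True if the URL contains any pattern registered for the collection."""
    patterns = COLLECTION_PATTERNS.get(collection, [])
    lowered = url.lower()
    return any(pattern in lowered for pattern in patterns)
```

For example, `url_matches_collection("State of Minnesota", "http://www.agencyname.state.mn.us/new.html")` is true, while a URL without the string "state.mn.us" is rejected by this simplified check.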

The spider will usually gather a new URL or resource on its next visit to your site, typically within 24 hours.

For information on how to direct search engine robot navigation within your Web site, see the Robot Exclusion page.

Resources 

The following resources can be found through a State of Minnesota search: 

  • HTML documents
  • Portable Document Format (PDF)
  • Geographic Information System (GIS)
  • Database Information

Current Settings 

The following settings are currently in use: 

  • Disallow URLs to CGI scripts
  • Maximum number of directories in a URL = 10
  • Maximum number of hops from root URL = 100
  • Languages allowed = any
  • Documents are considered duplicates if they are identical or have identical metadata
  • Documents have higher relevancy ranking if the search words are found in the metadata, including Dublin Core elements. Weighted elements include: Title, Description, Subject/Keywords, Alt Attribute, Remote Anchors
  • The minimum revisit is one day, maximum is 32 days. The spider tunes itself according to frequency of updates in a Web page.
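
Several of the settings above can be expressed as simple checks. The Python sketch below is an illustration only: the numeric limits come from the list, but the way a CGI URL is recognized (a `/cgi-bin/` directory or a `.cgi` file name) and all function names are assumptions, not the Ultraseek configuration itself.

```python
from urllib.parse import urlparse

# Limits taken from the settings list above.
MAX_DIRECTORIES = 10
MAX_HOPS = 100
MIN_REVISIT_DAYS = 1
MAX_REVISIT_DAYS = 32


def looks_like_cgi(url):
    """Assumed heuristic: treat /cgi-bin/ paths and .cgi files as CGI scripts."""
    path = urlparse(url).path.lower()
    return "/cgi-bin/" in path or path.endswith(".cgi")


def directory_count(url):
    """Count directory segments in the URL path, excluding a trailing file name."""
    path = urlparse(url).path
    segments = [segment for segment in path.split("/") if segment]
    if segments and not path.endswith("/"):
        segments = segments[:-1]  # drop the file name
    return len(segments)


def should_crawl(url, hops_from_root):
    """Apply the CGI, directory-depth, and hop-count settings to one URL."""
    return (
        not looks_like_cgi(url)
        and directory_count(url) <= MAX_DIRECTORIES
        and hops_from_root <= MAX_HOPS
    )


def clamp_revisit_days(estimated_days):
    """Keep a revisit interval within the one-day minimum and 32-day maximum."""
    return max(MIN_REVISIT_DAYS, min(MAX_REVISIT_DAYS, estimated_days))
```

Under these assumptions, a page two directories deep and three hops from the root passes `should_crawl`, while any URL under `/cgi-bin/` does not. How the spider estimates update frequency is not specified on this page, so `clamp_revisit_days` only enforces the stated bounds.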

Thesaurus 

The State of Minnesota Thesaurus is a comprehensive, cross-indexed set of subjects intended for search assistance. It is based on current vocabularies, including: 

  • Legislative Indexing Vocabulary
  • Minnesota GIS Community

The thesaurus allows like communities to use a common vocabulary when describing and detailing Web resources. 

 
Updated December 12, 2005