

Scope window


Use the Scope window to specify the scope of the trawl: how far DeepTrawl goes through the directories and sub-domains of a site, and whether the trawl is limited to certain pages.


To open the Scope of Trawl window:




Settings - Include tab


Trawl pages within

Specify which pages to trawl, relative to the specified start page:


The domain of the start page and any sub-domain

The default setting, designed for sites which include multiple directories and may have multiple sub-domains. All the pages within the domain are trawled, including those within sub-domains.


For example, if the start page is http://MyDomain.com/index.htm, the following pages would be included: http://MyDomain.com/contact.htm; http://MyDomain.com/products/p1.htm; and http://MySubDomain.MyDomain.com/index.htm.


The domain/sub-domain of the start page only

Designed for sites which include multiple directories within the same domain but not sub-domains. Only the pages within the domain or sub-domain specified in the start page URL are trawled.


For example, if the start page is http://MyDomain.com/index.htm, then pages in scope would include http://MyDomain.com/contact.htm and http://MyDomain.com/products/p1.htm, but NOT http://MySubDomain.MyDomain.com/index.htm.


The directory of the start page and all sub-directories

Designed for sites which only occupy a few directories on the server, e.g. a blogging site where the domain name is shared. Only the pages in the directory of the start page and its sub-directories will be trawled.


For example, if the start page is http://MyDomain.com/MyArea/index.htm, then pages in scope would include http://MyDomain.com/MyArea/contact.htm and http://MyDomain.com/MyArea/SubDir/contact.htm but NOT http://MyDomain.com/AnotherArea/index.htm.


The URL entered as the site address in the main window must end with "/" for this setting to work.



The directory of the start page only

Designed for sites which only occupy a single directory on the server. Only pages from the specified directory will be trawled.


For example, if the start page is http://MyDomain.com/MyArea/index.htm, then http://MyDomain.com/MyArea/contact.htm would be in scope, but NOT http://MyDomain.com/MyArea/MyDir/index.htm or http://MyDomain.com/AnotherArea/index.htm.


The URL entered as the site address in the main window must end with "/" for this setting to work.
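
The four "Trawl pages within" settings can be pictured as a simple URL test. The Python sketch below is only an illustration of the rules described above, not DeepTrawl's actual implementation. The function name in_scope and the mode names are hypothetical, and the sketch assumes that the host in the start page URL is "the domain". Unlike DeepTrawl, it works out the directory from the start page path itself, so it does not need the trailing "/".

```python
from urllib.parse import urlsplit
import posixpath


def _directory(path):
    """Return the directory part of a URL path, always ending with '/'."""
    if path.endswith("/"):
        return path
    return posixpath.dirname(path).rstrip("/") + "/"


def in_scope(start_url, candidate_url, mode):
    """Decide whether candidate_url falls inside the scope set by start_url.

    mode is one of four hypothetical names, one per setting:
    'domain_and_subdomains', 'domain_only',
    'directory_and_subdirectories', 'directory_only'.
    """
    start, cand = urlsplit(start_url), urlsplit(candidate_url)
    # Host names are not case sensitive, so compare them in lower case.
    s_host = (start.hostname or "").lower()
    c_host = (cand.hostname or "").lower()

    if mode == "domain_and_subdomains":
        # Assumption: the start page host is treated as the domain.
        return c_host == s_host or c_host.endswith("." + s_host)

    # Every other setting requires exactly the same host.
    if c_host != s_host:
        return False
    if mode == "domain_only":
        return True

    s_dir, c_dir = _directory(start.path), _directory(cand.path)
    if mode == "directory_and_subdirectories":
        return c_dir.startswith(s_dir)
    if mode == "directory_only":
        return c_dir == s_dir
    raise ValueError("unknown mode: " + mode)
```

For instance, with the start page http://MyDomain.com/MyArea/index.htm, in_scope accepts http://MyDomain.com/MyArea/SubDir/contact.htm under 'directory_and_subdirectories' and rejects it under 'directory_only', which matches the examples given for those two settings.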




Trawl

Alternatively, specify pages to trawl according to their URL:


Start page only

Only the start page and the CSS files it references will be trawled.


Any page whose URL includes the text below:

Useful for trawling sites...


Only pages whose URL includes the specified text will be trawled. If the capitalization matters, select Case sensitive.
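
As a rough illustration of this setting, the Python sketch below tests whether a URL includes a piece of text, with and without case sensitivity. The function name url_includes is hypothetical, and the sketch assumes that capitalization is ignored when Case sensitive is not selected. It is not DeepTrawl's own code.

```python
def url_includes(url, text, case_sensitive=False):
    """Return True if text appears anywhere in url."""
    if not case_sensitive:
        # Assumption: with Case sensitive unselected, capitalization is ignored.
        url, text = url.lower(), text.lower()
    return text in url
```

With the text "/Products/", the URL http://MyDomain.com/products/p1.htm would match when case is ignored, but not when Case sensitive is selected.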


Advanced scope settings

Click this link to open the Scope tab of the Advanced Settings window.



Settings - Exclude tab


Exclude list

Enter a list of full or partial URLs to exclude from the trawl, one per line. For example, add http://www.MyDomain.com/myFile.htm to exclude only myFile.htm, or http://www.MyDomain.com/myDir/ to exclude all the pages in the myDir directory.
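
One way to picture the Exclude list is as a set of text fragments checked against each page URL. The Python sketch below is an illustration under assumptions: a URL is treated as excluded if it contains any non-empty line of the list, and the comparison is case sensitive. The function name is_excluded is hypothetical, and DeepTrawl's real matching rules may differ.

```python
def is_excluded(url, exclude_list_text):
    """Return True if url contains any entry of the exclude list.

    exclude_list_text holds one full or partial URL per line,
    as typed into the Exclude list. Blank lines are ignored.
    """
    entries = [line.strip() for line in exclude_list_text.splitlines()]
    # Assumption: an entry matches when it appears anywhere in the URL.
    return any(entry in url for entry in entries if entry)


exclude_list = """http://www.MyDomain.com/myFile.htm
http://www.MyDomain.com/myDir/
"""
```

Here is_excluded("http://www.MyDomain.com/myDir/page.htm", exclude_list) is true because of the second entry, while http://www.MyDomain.com/index.htm matches neither entry and would still be trawled.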