On the ‘General’ settings tab, you can change the interface language, crawling speed, and basic crawling settings.
1. Interface language
You can choose either English or Russian as the interface language in Netpeak Spider: click the button with the corresponding name and select the necessary option from the dropdown list.
Please note that the program has to be restarted for the change to take full effect.
2. Crawling speed
2.1. Number of threads
Each thread creates a separate connection to the website, so be careful: websites sensitive to high load may struggle to serve requests. You can adjust the number of threads during crawling to find the optimal value for the analyzed website. By default, the number of threads is 10.
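Netpeak Spider's internals aren't public, but the idea of a fixed pool of crawling threads can be sketched in Python. Here `fetch()` is a hypothetical stand-in for an HTTP request; a real crawler would open a separate connection per thread.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an HTTP request returning a status code.
def fetch(url):
    return (url, 200)  # simulated response

urls = [f"https://example.com/page-{i}" for i in range(25)]

# 10 workers mirrors the default thread count: at most 10 URLs are
# requested simultaneously; remaining URLs wait for a free worker.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = dict(pool.map(fetch, urls))

print(len(results))  # 25
```

Lowering `max_workers` is the equivalent of reducing the thread count for a load-sensitive site.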
2.2. Delay between requests
This is the amount of time between consecutive requests to the web server. For websites that are sensitive to high load or protected against automated requests, it is recommended to increase this value to avoid overloading the server or to get past the website's protection.
The delay is applied separately to each thread, so to imitate user behavior it is recommended to use a single thread with a 1,500-3,000 ms delay between requests.
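A single-threaded, delayed crawl like the one recommended above can be sketched as follows; `fetch()` is again a hypothetical request stub, and the demo uses a short delay only to keep it fast.

```python
import time

# Hypothetical request stub; a real crawler would issue an HTTP GET here.
def fetch(url):
    return 200

def crawl_politely(urls, delay_ms=2000):
    """Single-threaded crawl that pauses between requests (user-like pace)."""
    statuses = {}
    for i, url in enumerate(urls):
        statuses[url] = fetch(url)
        if i < len(urls) - 1:  # no pause needed after the last URL
            time.sleep(delay_ms / 1000)
    return statuses

# delay_ms=10 only for the demo; 1500-3000 ms is the recommended range.
result = crawl_politely(["https://example.com/a", "https://example.com/b"],
                        delay_ms=10)
print(result)
```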
2.3. Response timeout
This is the maximum time, in milliseconds, the crawler waits for a server response before it considers the page broken, assigns the ‘Timeout‘ response code, and moves on to the next URL. This setting also affects detection of ‘Connection Error‘.
- Minimum possible value – 50 ms.
- Maximum possible value – 90,000 ms.
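The timeout/connection-error distinction can be illustrated with Python's standard library. The label strings below mirror the ones this article mentions, but the mapping itself is an assumption, not Netpeak Spider's actual code.

```python
import socket
import urllib.request
import urllib.error

def fetch_status(url, timeout_ms=30000):
    """Return the HTTP status code, or a 'Timeout' / 'Connection Error'
    label (names borrowed from the article; mapping is an assumption)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_ms / 1000) as resp:
            return resp.status
    except socket.timeout:
        return "Timeout"
    except urllib.error.URLError as e:
        # A timeout may also surface wrapped in URLError.
        if isinstance(e.reason, socket.timeout):
            return "Timeout"
        return "Connection Error"
```

A host that cannot be resolved yields "Connection Error", while a server that accepts the connection but never answers within the limit yields "Timeout".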
3. Basic crawling settings
3.1. Crawl only in directory
The program will crawl pages only inside a particular directory, without leaving it.
Please take into account that Netpeak Spider relies on the path segments of a page's URL, so the website must have an appropriate URL structure to use this mode. For example, when crawling the category at example.com/category-1, product pages such as example.com/category-1/product will be included in the reports, but pages at example.com/product will not, because their URLs start with a different path segment, even if the crawled category links to them.
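The segment-based check described above can be sketched with `urllib.parse`; this is an illustration of the rule, not Netpeak Spider's implementation.

```python
from urllib.parse import urlparse

def in_directory(url, directory_url):
    """True if url lies inside directory_url, judged by path segments."""
    base = urlparse(directory_url)
    u = urlparse(url)
    if u.netloc != base.netloc:
        return False
    # Require a full segment match: /category-1/ as a prefix, so that
    # /category-10/... is NOT treated as part of /category-1.
    prefix = base.path.rstrip("/") + "/"
    return u.path == base.path or u.path.startswith(prefix)

print(in_directory("https://example.com/category-1/product",
                   "https://example.com/category-1"))   # True
print(in_directory("https://example.com/product",
                   "https://example.com/category-1"))   # False
```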
3.2. Crawl all subdomains
If this option is checked, subdomains are treated as part of the analyzed website and links to them are considered internal. Otherwise, results from subdomains are not treated as part of the crawled website, and links to them are considered external.
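The internal/external distinction this option controls amounts to a hostname check, which can be sketched like this (the function and its logic are illustrative assumptions):

```python
from urllib.parse import urlparse

def is_internal(link, site_host, crawl_subdomains=True):
    """Classify a link as internal to site_host, optionally counting
    subdomains (e.g. blog.example.com) as internal."""
    host = urlparse(link).netloc.lower()
    site = site_host.lower()
    if host == site:
        return True
    # The leading dot prevents evilexample.com from matching example.com.
    return crawl_subdomains and host.endswith("." + site)

print(is_internal("https://blog.example.com/post", "example.com"))  # True
print(is_internal("https://blog.example.com/post", "example.com",
                  crawl_subdomains=False))                          # False
```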
3.3. Crawl external links
Choose this option to add all external links to the main table. Note that the same parameters and issues are checked for external links as for internal ones, so the ‘Issues‘ panel will show the total number of issues across both. However, you can build a report for external links only using the segmentation feature.
3.5. Check images
We recommend enabling this configuration because:
- It allows the program to collect common SEO-parameters for images.
- It affects detection of the ‘Broken images‘ and ‘Max image size‘ issues.
3.6. Check other MIME types
This setting enables collecting information about documents, video and audio files, and other file types. As with images, Netpeak Spider doesn't scan their content but collects their common SEO parameters.
You can use the built-in templates for specific crawling scenarios: from the default template, suitable for most standard SEO tasks, to a template that crawls websites much like search engine robots do.