A proxy is an intermediary server between a user's device and a target server (website).
Using a proxy changes the user's IP address, which makes it possible to:
- check a website's behavior when accessing it from different locations;
- overcome the limit on sending requests in bulk to a web server or search engine from a single IP address;
- simulate network traffic to test a site's server while it is still in development (private proxy servers are recommended in this case);
- speed up website crawling.
You can find the proxy settings under ‘Settings → Proxy’. Tick the ‘Use a list of HTTP proxies’ checkbox to manage a list of proxies.
1. Ways of Adding Proxies
Netpeak Spider allows setting up a proxy in different ways:
- Manually. Click the ‘Enter manually’ button on the right side or press the Ctrl+D hotkey, then add the list of proxies in the window that opens. Each proxy should be placed on a new row.
Input format: Host:Port or Host:Port:Login:Password
- Using the table. Click the ‘Add row’ button or press the Ctrl+N hotkey to add a new proxy.
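The input format above can be illustrated with a small parser. This is only a sketch of how such lines might be read outside the program, not Netpeak Spider's own code: the names `Proxy`, `parse_proxy_line`, and `parse_proxy_list` are hypothetical, and the sketch assumes that logins and passwords contain no colons.

```python
from typing import List, NamedTuple, Optional


class Proxy(NamedTuple):
    host: str
    port: int
    login: Optional[str] = None
    password: Optional[str] = None


def parse_proxy_line(line: str) -> Proxy:
    """Parse 'Host:Port' or 'Host:Port:Login:Password' into a Proxy."""
    parts = line.strip().split(":")
    # Assumption: fields never contain ':' themselves, so only 2 or 4 parts are valid.
    if len(parts) not in (2, 4):
        raise ValueError(f"expected Host:Port or Host:Port:Login:Password, got {line!r}")
    host, port_text = parts[0], parts[1]
    if not host:
        raise ValueError(f"empty host in {line!r}")
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"port must be a number in {line!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range in {line!r}")
    if len(parts) == 4:
        return Proxy(host, port, parts[2], parts[3])
    return Proxy(host, port)


def parse_proxy_list(text: str) -> List[Proxy]:
    """Parse a multi-line list, one proxy per row, skipping blank rows."""
    return [parse_proxy_line(row) for row in text.splitlines() if row.strip()]
```

For example, `parse_proxy_list("192.0.2.10:8080\n192.0.2.11:3128:alice:s3cret")` yields two entries, the second one carrying a login and password.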
Keep in mind that:
- IPv4 is the only IP address format supported in the ‘Host or IP’ field;
- The ‘Port’ field has 80 as the default value;
- The ‘Status Code’ field displays a status code that describes the proxy's condition, so you can judge whether a proxy works by its status code. The most common codes and their meanings are:
  - 200 OK – the proxy has access to the requested resource;
  - 403 Forbidden – access to the proxy is denied, and authentication will not change that;
  - 407 Proxy Authentication Required – the proxy requires authentication, or the entered login and password are incorrect.
- The ‘Response Time’ field displays the time the checked service takes to respond via the proxy;
- Columns can be rearranged as desired.
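The notes above can be summarized in a short sketch that checks whether a host is an IPv4 address and interprets the three status codes listed. The function names and the wording of the messages are illustrative assumptions; they do not come from Netpeak Spider.

```python
import ipaddress

DEFAULT_PORT = 80  # default value of the 'Port' field, as stated above

# Meanings of the most common status codes, paraphrased from the list above.
STATUS_MEANINGS = {
    200: "OK: the proxy has access to the requested resource",
    403: "Forbidden: access is denied and authentication will not help",
    407: "Proxy Authentication Required: credentials are missing or incorrect",
}


def is_ipv4(host: str) -> bool:
    """Return True only for a valid IPv4 address (hostnames and IPv6 give False)."""
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        return False


def describe_status(code: int) -> str:
    """Return a human-readable meaning for a proxy check status code."""
    return STATUS_MEANINGS.get(code, f"{code}: not covered by the list above")


def proxy_is_usable(code: int) -> bool:
    """Assumption for this sketch: only 200 counts as a working proxy."""
    return code == 200
```

With these helpers, `is_ipv4("2001:db8::1")` is false, and a proxy that returns 407 is reported as needing correct credentials rather than as working.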
Regardless of the chosen way of adding proxies, you can paste them as a list from the clipboard:
- using the context menu;
- using the Ctrl+C → Ctrl+V hotkey combination;
- using the ‘Paste’ button.
2. How to Work With Proxies in Netpeak Spider
Proxies are checked automatically once they are initialized. If you need to recheck them, click the ‘Check’ button and choose the type of check:
- Internet access
Click the ‘Stop’ button if you need to stop the check.
Deleting proxies is just as convenient: the program can remove all proxies, only the selected ones, or only the dead ones. Click the ‘Delete’ button and choose one of the suggested options.
When crawling with a list of proxies, requests are sent to the crawled site from different IP addresses. The order in which proxies are used depends on the number of threads set on the ‘General’ tab. If one thread is set, each request is sent from the next proxy in the list, and when the end of the list is reached, the loop starts again from the beginning. If several threads are used, several requests are sent from different proxies at the same time.
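The rotation described above can be sketched as a round-robin loop over the proxy list. This is a simplified model of the described behavior, not Netpeak Spider's actual implementation: `ProxyRotator` and `crawl` are hypothetical names, and no real requests are sent (the `fetch` function only records which proxy each URL would use).

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


class ProxyRotator:
    """Hands out proxies in list order and wraps around at the end."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("proxy list is empty")
        self._cycle = cycle(proxies)
        self._lock = threading.Lock()  # keeps the rotation consistent across threads

    def next_proxy(self):
        with self._lock:
            return next(self._cycle)


def crawl(urls, proxies, threads=1):
    """Pair every URL with the proxy it would be requested through."""
    rotator = ProxyRotator(proxies)

    def fetch(url):
        # Placeholder: a real crawler would send the request via this proxy.
        return url, rotator.next_proxy()

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch, urls))
```

With one thread, five URLs, and three proxies, the proxies are used in the order 1, 2, 3, 1, 2. With several threads, the pairing of URLs to proxies may vary between runs, but the proxies are still taken from the list in turn, so the load is spread evenly.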
We recommend using paid proxies because they offer several benefits:
- You can choose the proxy specifications (speed, response time, anonymity level, etc.);
- The chance of being banned by websites is significantly lower;
- You can choose the region you need without worrying that the proxy will soon be ‘dead’;
- If something malfunctions, you can contact customer support.