- What is data analysis
- Parameters required to start data analysis
- Stopping data analysis
- Manual start of data analysis
1. What is data analysis
During data analysis Netpeak Spider:
- Checks for duplicates by page hash, text hash, H1 content, title, description, and canonical URL.
- Checks for issues against the restrictions defined in your custom settings (Settings → Restrictions).
- Checks hreflang issues → analyzes the detected links to language versions that are set with the hreflang attribute.
- Counts incoming links → calculates the number of incoming links for every crawled page. We recommend running this analysis after the crawling is complete.
- Checks canonical chains → detects pages whose canonical URL points to a page that, in turn, specifies a different canonical URL.
- Calculates internal PageRank for each page of the website. The values are displayed in the filtered results column. The calculation of the internal PageRank in the ‘Analyze‘ module is performed without any preliminary settings. To customize the settings, open the ‘Internal PageRank calculation’ tool.
- Gets data from Google Analytics and Google Search Console (GSC) according to the settings in the ‘Google Analytics & Search Console’ tab.
- Exports queries from Search Console for each URL.
- Gets Yandex.Metrica data according to the settings in the ‘Yandex.Metrica’ tab.
Data analysis in Netpeak Spider begins once the crawling is stopped or finished. It is implemented this way to save device resources and speed up the crawling process.
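To make three of the checks listed above more concrete, here is a minimal Python sketch of duplicate detection by hash, incoming link counting, and canonical chain detection. It is an illustration only, not Netpeak Spider's actual implementation: the `pages` structure, the field names, and all function names are hypothetical, and the sketch assumes the crawl results are already available in memory.

```python
from collections import defaultdict

# Hypothetical crawl results: URL -> text hash, canonical URL, outgoing links.
pages = {
    "/a": {"text_hash": "h1", "canonical": "/b", "links": ["/b", "/c"]},
    "/b": {"text_hash": "h1", "canonical": "/c", "links": ["/c"]},
    "/c": {"text_hash": "h2", "canonical": "/c", "links": ["/a"]},
}


def find_duplicates(pages, field):
    """Group URLs that share the same value of `field` (e.g. a hash)."""
    groups = defaultdict(list)
    for url, data in pages.items():
        groups[data[field]].append(url)
    # Only values shared by two or more URLs count as duplicates.
    return {value: sorted(urls) for value, urls in groups.items() if len(urls) > 1}


def count_incoming_links(pages):
    """Count links pointing to each crawled page (assumption: every link counts once)."""
    counts = {url: 0 for url in pages}
    for data in pages.values():
        for target in data["links"]:
            if target in counts:  # ignore links to pages that were not crawled
                counts[target] += 1
    return counts


def canonical_chain(pages, start):
    """Follow canonical URLs from `start` until a self-canonical page, an unknown page, or a loop."""
    chain = [start]
    current = start
    while current in pages:
        target = pages[current]["canonical"]
        if not target or target == current or target in chain:
            break
        chain.append(target)
        current = target
    return chain


duplicates = find_duplicates(pages, "text_hash")
incoming = count_incoming_links(pages)
# A chain longer than two URLs means a canonical target has its own, different canonical.
chains = {url: c for url in pages if len(c := canonical_chain(pages, url)) > 2}
```

All three functions need the complete set of crawled pages as input, which is consistent with the analysis running only after the crawling has stopped or finished.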
2. Parameters required to start data analysis
Please note: data analysis is possible only if the following parameters are enabled in the sidebar before the crawling starts:
2.1. For duplicates analysis:
- ‘Canonical URL‘ parameter in the ‘Crawling and Indexing‘ group.
- ‘Title‘ and ‘Description‘ parameters in the ‘Head tags‘ group.
- ‘H1 Content‘ parameter in the ‘H1-H6 Headings‘ → ‘Content‘ group.
- ‘Page hash‘ and ‘Text Hash‘ in the ‘Unique Hashes‘ → ‘Content‘ group.
2.2. For counting incoming links and calculating internal PageRank → ‘Links‘ group.
2.3. For checking canonical chains → ‘Canonical‘ parameter in the ‘Crawling and Indexing‘ group.
2.4. To get data from Google Analytics & Search Console and export queries from GSC, choose the necessary parameters in the ‘Google Analytics’ and ‘Google Search Console’ groups.
Please note: to enable the export of queries from Google Search Console, tick the corresponding checkbox in the ‘Google Analytics and Search Console’ settings tab.
Getting data from Google Analytics and Search Console and exporting queries from Search Console is possible only if you have access to the sites in these services.
Learn more about connecting your Google accounts in Netpeak Spider in the ‘Getting data from services‘ section.
3. Stopping data analysis
Once the crawling is completed, you can cancel data analysis by hitting the ‘Cancel’ button in the pop-up window. All processed data will be saved, and you can always run data analysis manually later (e.g. to recalculate internal PageRank after you delete part of the results).
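The following Python sketch illustrates why internal PageRank needs to be recalculated after part of the results is deleted. It uses the classic PageRank power iteration with a damping factor of 0.85, which is an assumption made for illustration; the exact algorithm and settings used by Netpeak Spider may differ. The `links` graph and the function names are hypothetical.

```python
def internal_pagerank(links, damping=0.85, iterations=50):
    """Classic PageRank by power iteration over an internal link graph.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # Only links to pages that are still in the results pass weight.
            targets = [t for t in links[page] if t in links]
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page without outgoing links: spread its weight evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


def remove_pages(links, removed):
    """Return a copy of the graph without the removed pages and links to them."""
    return {
        page: [t for t in targets if t not in removed]
        for page, targets in links.items()
        if page not in removed
    }


# Hypothetical three-page site.
links = {"/": ["/a", "/b"], "/a": ["/"], "/b": ["/", "/a"]}

before = internal_pagerank(links)
after = internal_pagerank(remove_pages(links, {"/b"}))
```

Deleting `/b` removes both a page that received weight and the links it passed on, so the values of the remaining pages shift. This is why the values calculated before the deletion no longer describe the remaining results.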
4. Manual start of data analysis
To start the analysis manually, choose the ‘Analysis‘ tab and select the data type you need.
After changing any of the analyzed data, restart the analysis for the changes to take effect.
Learn more about the parameters and data they are responsible for in the ‘What SEO parameters does Netpeak Spider check?‘ article.