Data Analysis

  1. What is data analysis
  2. Parameters required to start data analysis
  3. Stopping data analysis
  4. Manual start of data analysis

1. What is data analysis

During data analysis Netpeak Spider:

  • Checks for duplicates by page hash, text hash, H1, title, description, and canonical URL.
  • Checks for issues that depend on the limits set in your custom configuration (Settings → Restrictions).
  • Checks hreflang issues: analyzes the detected links to language versions that are set with the hreflang attribute.
  • Counts incoming links: calculates the number of incoming links for every crawled page. We recommend running this analysis after the crawling is complete.
  • Checks canonical chains: detects chains of canonical URLs.
  • Calculates internal PageRank for each page of the website. The values are displayed in the filtered results column. In the ‘Analyze‘ module, the calculation is performed without any preliminary settings. To customize the settings, open the ‘Internal PageRank calculation’ tool.
  • Gets data from Google Analytics and Google Search Console (GSC) according to the settings in the ‘Google Analytics & Search Console‘ tab.
  • Exports queries from Search Console for each URL.
  • Gets Yandex.Metrica data according to the settings in the ‘Yandex.Metrica‘ tab.

Data analysis in Netpeak Spider begins once the crawling is stopped or finished. Running the analysis after the crawl, rather than during it, saves device resources and speeds up the crawling process.

2. Parameters required to start data analysis

Please note: data analysis is possible only if the following parameters are enabled in the sidebar before the crawling starts:

2.1. For duplicates analysis:

  • ‘Canonical URL‘ parameter in the ‘Crawling and Indexing‘ group. 
  • ‘Title‘ and ‘Description‘ parameters in the ‘Head tags‘ group.

  • ‘H1 Content‘ parameter in the ‘H1-H6 Headings‘ → ‘Content‘ group.
  • ‘Page hash‘ and ‘Text Hash‘ in the ‘Unique Hashes‘ → ‘Content‘ group.
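
To illustrate why these parameters have to be collected first, here is a minimal Python sketch of a duplicates check. It is not Netpeak Spider's implementation: the `pages` structure, the field names, and the `find_duplicates` helper are hypothetical and only show the general idea of grouping URLs that share the same value of a parameter.

```python
from collections import defaultdict

def find_duplicates(pages, field):
    """Group URLs that share the same non-empty value of `field`."""
    groups = defaultdict(list)
    for url, params in pages.items():
        value = params.get(field)
        if value:  # skip pages where the parameter is missing or empty
            groups[value].append(url)
    # Only values shared by two or more URLs count as duplicates.
    return {value: urls for value, urls in groups.items() if len(urls) > 1}

# Hypothetical crawl results: URL -> collected parameters.
pages = {
    "https://example.com/": {"title": "Home", "page_hash": "a1"},
    "https://example.com/index.html": {"title": "Home", "page_hash": "a1"},
    "https://example.com/about": {"title": "About", "page_hash": "b2"},
}

duplicate_titles = find_duplicates(pages, "title")
duplicate_hashes = find_duplicates(pages, "page_hash")
```

If a parameter such as ‘Title‘ was not enabled before the crawl, there is simply no value to group by, which is why the corresponding duplicates check cannot run.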

2.2. For counting incoming links and calculating internal PageRank → ‘Links‘ group.
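
Both calculations rely on the link data collected during crawling. The Python sketch below shows the general idea with a textbook PageRank computed by power iteration. It is an illustration only: the `links` list, the helper names, the damping factor of 0.85, and the number of iterations are assumptions, and the algorithm Netpeak Spider actually uses may differ.

```python
from collections import Counter

def count_incoming_links(links):
    """Count how many links point to each target URL."""
    return Counter(target for _source, target in links)

def internal_pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over internal links."""
    pages = sorted({url for link in links for url in link})
    outgoing = {page: [] for page in pages}
    for source, target in links:
        outgoing[source].append(target)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same base share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # A page without outgoing links spreads its rank evenly.
            targets = outgoing[page] or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical internal links as (source, target) pairs.
links = [("/", "/a"), ("/", "/b"), ("/a", "/b"), ("/b", "/")]
incoming = count_incoming_links(links)
ranks = internal_pagerank(links)
```

Because every value depends on the full set of links, removing part of the results changes the numbers for the remaining pages, so the analysis has to be run again.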

2.3. For checking canonical chains → ‘Canonical‘ parameter in the ‘Crawling and Indexing‘ group.  
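
As a rough illustration of what a canonical chain is, the Python sketch below follows canonical URLs from page to page. It is not Netpeak Spider's implementation: the `canonicals` mapping and the `canonical_chain` helper are hypothetical, and a chain is assumed here to mean that a page's canonical URL itself points to yet another canonical URL.

```python
def canonical_chain(canonicals, start):
    """Follow canonical URLs from `start`; return the list of visited URLs."""
    chain = [start]
    seen = {start}
    current = start
    while True:
        target = canonicals.get(current)
        # Stop at a missing or self-referencing canonical.
        if target is None or target == current:
            break
        chain.append(target)
        if target in seen:  # loop detected; stop to avoid cycling forever
            break
        seen.add(target)
        current = target
    return chain

# Hypothetical crawl data: URL -> canonical URL declared on that page.
canonicals = {
    "/a": "/b",  # /a declares /b as canonical
    "/b": "/c",  # /b itself points on to /c, which forms a chain
    "/c": "/c",  # a self-referencing canonical ends the chain
}

chain = canonical_chain(canonicals, "/a")
is_chain = len(chain) > 2  # more than one hop from the start page
```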

2.4. To get data from Google Analytics & Search Console and export queries from GSC, choose the necessary parameters in the ‘Google Analytics‘ and ‘Google Search Console‘ groups.

Please note: to enable the parameter that exports queries from Google Search Console, tick the corresponding checkbox in the ‘Google Analytics and Search Console‘ settings tab.

2.5. To get data from Yandex.Metrica, choose the necessary parameters in the ‘Yandex.Metrica‘ group.

Getting data from Google Analytics, Search Console, and Yandex.Metrica, and exporting queries from Search Console, is possible only if you have access to the sites in these services.

Learn more about how to connect your Google and Yandex accounts in Netpeak Spider in the ‘Getting data from services‘ section.

3. Stopping data analysis

Once the crawling is completed, you can cancel data analysis by clicking the ‘Cancel‘ button in the pop-up window. All processed data will be saved, and you can always run data analysis manually later (e.g. to recalculate internal PageRank after deleting part of the results).

4. Manual start of data analysis

To start the analysis manually, choose the ‘Analysis‘ tab and select the data type you need.

After changing any data involved in the analysis, restart the analysis for the updates to take effect.

Learn more about the parameters and the data they are responsible for in the ‘What SEO parameters does Netpeak Spider check?‘ article.
