Duplicate pages
If pages of a website are available at different addresses but have the same content, the Yandex robot may consider them duplicates and merge them into a duplicate group.
If your website has duplicate pages:
- The page you need may disappear from search results if the robot selects another page from the duplicate group.
- In some cases, pages with GET parameters are not grouped and participate in the search as different documents. As a result, they compete with each other, which may impact the website's ranking in search results.
- Depending on which page remains in the search, the address of the document may change. This may affect, for example, the reliability of statistics in web analytics services.
- It takes the indexing robot longer to crawl the website's pages, which means the data about pages that are important to you is sent to the search database more slowly. The robot can also create an additional load on your website.
How to determine if your website has duplicate pages
Duplicate pages appear for a variety of reasons:
- Natural. For example, a page with a product description may be available in several categories of an online store.
- Related to the features of the site or its CMS.
To find out which pages the robot considers duplicates:
- In Yandex.Webmaster, go to the Pages in search page and select Excluded pages.
- Click the icon and select the “Deleted: Duplicate” status.
You can also download the archive: choose the file format at the bottom of the page. In the file, duplicate pages have the DUPLICATE status. Learn more about statuses.
If the duplicates were created because GET parameters were added to the URL, a notification about this will appear on the Troubleshooting page in Yandex.Webmaster.
How to get rid of duplicate pages
To keep the right page in the search results, point the Yandex robot to it. This can be done in several ways, depending on the URL type.
Example for a regular site:
http://example.com/page1/ and http://example.com/page2/
In this case, do one of the following (see the sketch after this list):
- Set up a 301 redirect from one duplicate page to the other. The target of the redirect will be included in the search results.
- Specify the preferred (canonical) URL for the page to be included in the search.
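A minimal sketch of both options, assuming an Apache server and that http://example.com/page1/ is the page you want to keep in the search:

```apache
# .htaccess — permanently redirect the duplicate to the preferred page (301)
Redirect permanent /page2/ /page1/
```

```html
<!-- Placed in the <head> of http://example.com/page2/ instead of the redirect -->
<link rel="canonical" href="http://example.com/page1/">
```

On other servers the redirect is configured differently (for example, with a return 301 rule in nginx), but the effect is the same.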
Example for a site with AMP pages:
http://example.com/page/ and http://example.com/AMP/page/
In this case, add the Disallow directive to the robots.txt file to prevent the duplicate page from being indexed.
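A sketch of such a rule, assuming the AMP copies live under the /AMP/ path and are the versions to exclude:

```text
User-agent: Yandex
Disallow: /AMP/
```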
Example for a site's main page:
https://example.com and https://example.com/index.php
In this case, do one of the following (see the sketch after this list):
- Set up a 301 redirect from one duplicate page to the other. The target of the redirect will be included in the search results.
- Specify the preferred (canonical) URL for the page to be included in the search.
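A minimal redirect sketch, assuming Apache with mod_rewrite and that the root URL is the one to keep; the condition checks the original client request to avoid a redirect loop when index.php is the directory index:

```apache
RewriteEngine On
# Redirect explicit /index.php requests to the root URL with a 301
RewriteCond %{THE_REQUEST} \s/index\.php[?\s]
RewriteRule ^index\.php$ / [R=301,L]
```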
Example of URLs with and without a trailing slash:
http://example.com/page/ and http://example.com/page
In this case, set up a 301 redirect from one duplicate page to the other; the target of the redirect will be included in the search results (see the sketch below).
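A sketch, again assuming Apache with mod_rewrite and that the trailing-slash version should stay in the search; the condition skips real files so that assets like /style.css are not redirected:

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]
```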
Example of URLs with GET parameters:
http://example.com/page/, http://example.com/page?id=1 and http://example.com/page?id=2
- Add the Clean-param directive to the robots.txt file so that the robot ignores the URL parameters (see the sketch after this list). If Yandex.Webmaster shows a notification about page duplication caused by GET parameters, this method will fix the error; the notification will disappear once the robot learns about the changes.
- Specify the preferred (canonical) URL for the page to be included in the search.
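A minimal robots.txt sketch for this example, assuming id is the only parameter to ignore; the second argument restricts the rule to URLs whose path starts with /page:

```text
User-agent: Yandex
Clean-param: id /page
```

The canonical alternative is an HTML hint in the <head> of each parameterized duplicate:

```html
<link rel="canonical" href="http://example.com/page/">
```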
Example of URLs with UTM tags:
http://example.com/page?utm_source=link&utm_medium=cpc&utm_campaign=new and http://example.com/page?utm_source=instagram&utm_medium=cpc
In this case, add the Clean-param directive to the robots.txt file so that the robot ignores the parameters in the URL.
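A sketch of the directive listing all three tags; with no path argument, the rule applies to every page on the site:

```text
User-agent: Yandex
Clean-param: utm_source&utm_medium&utm_campaign
```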
The robot learns about the changes the next time it crawls your site. After that, the page that shouldn't be included in the search will be excluded from it within three weeks. If the site has many pages, this may take longer.
You can check that the changes have come into effect in the Pages in search section of Yandex.Webmaster.