Canonical URLs
If a site has a page available at multiple URLs, or pages with identical or similar content, the Yandex robot may count them as duplicates. In this case, it combines the pages into a group of duplicates and chooses one of them, the most informative and relevant to the search query, to display in the search results. This page is called canonical.
You can use the rel="canonical" attribute to specify which page to show in the search results. You can also specify the canonical URL to change the site address to a domain with or without the www prefix.
If you want to change the main site address when moving from HTTP to HTTPS, use a 301 or 302 redirect. The rel="canonical" attribute is no longer supported for addresses of this format. For more information, see the article Disabling rel="canonical" support when moving from HTTPS to HTTP in the Yandex blog for webmasters.
Alert
The Yandex robot interprets references to the canonical address as recommendations and can ignore them in several cases.
How do I specify the canonical URL of a page?
Specify the canonical URL with the rel="canonical" attribute using one of the following methods: a link element in the HTML code of the page, or the Link HTTP header.
For example, a page is accessible at two addresses: www.example.com/pages?id=2 and www.example.com/blog. If the preferred address is /blog, add the link element to the HTML code of the /pages?id=2 page:
<link rel="canonical" href="http://www.example.com/blog"/>
Let's say a site has a PDF file available at multiple URLs: www.example.com/offer/file.pdf and www.example.com/files/file.pdf. If the preferred URL is /offer/file.pdf, configure the server to pass the following in the HTTP header of the /files/file.pdf page:
Link: <http://www.example.com/offer/file.pdf>; rel="canonical"
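To confirm that a page actually declares the canonical URL you expect, you can inspect its HTML code and its Link header yourself. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the header parsing is deliberately simplified (it assumes rel="canonical" is the only value of rel), so treat it as an illustration rather than a full parser.

```python
import re
from html.parser import HTMLParser


class _CanonicalParser(HTMLParser):
    """Collects href values of <link rel="canonical"> elements."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link .../> tags are also routed here by HTMLParser.
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def canonicals_from_html(html):
    """Return all canonical URLs declared in the HTML code."""
    parser = _CanonicalParser()
    parser.feed(html)
    return parser.hrefs


# Simplified pattern: <URL>; ... rel="canonical" within one comma-separated entry.
_LINK_RE = re.compile(r'<([^>]*)>\s*;[^,]*\brel\s*=\s*"?canonical"?', re.IGNORECASE)


def canonicals_from_link_header(value):
    """Return all canonical URLs declared in a Link header value."""
    return _LINK_RE.findall(value)
```

For the two examples above, canonicals_from_html returns the /blog address and canonicals_from_link_header returns the /offer/file.pdf address. A result with more than one entry means the page declares several canonical URLs.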
Note
Specify the canonical URL within the same domain. Write it as an absolute URL that includes the protocol and domain, for example http://example.com/blog/.
A page whose rel="canonical" attribute points to another URL is considered non-canonical.
The robot learns about the changes when it crawls the site. If the canonical URL is entered correctly and the robot doesn't ignore the instructions, the non-canonical page disappears from the search results. To see if the page was removed from search results, go to Indexing → Searchable pages (Excluded pages section) in Yandex Webmaster.
The robot ignores the instruction if the contents of the canonical and non-canonical pages differ significantly. In this case, the non-canonical page may be included in the search. To check this, go to Indexing → Searchable pages.
To exclude a non-canonical page that contains GET parameters or tags (UTM, from, and so on) in the URL, add the Clean-param directive in the robots.txt file. Otherwise, use the Disallow directive.
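To preview which addresses would collapse into one once tracking parameters are ignored, you can normalize the URLs yourself. The sketch below is an illustration only: the function name and the parameter list are assumptions, and it mimics the effect of ignoring parameters rather than how the robot or robots.txt processes them.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that don't change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "from"}


def strip_params(url, ignored=IGNORED_PARAMS):
    """Return the URL without the ignored GET parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

URLs that map to the same result differ only in the ignored parameters, so they are candidates for a Clean-param rule. Parameters that do change the content, such as id in the earlier example, are kept.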
How to change the site address using the canonical URL
You can specify the canonical URL to change the site address to a domain with or without the www prefix.
The robot will interpret the canonical address as a redirect to the new main site address and group the two site versions. To do this, add a link to the corresponding page of the new site with the rel="canonical" attribute in the HTML code or in the HTTP header of every page of the old site. For example, you want to change https://www.example.com to https://example.com. On the page https://www.example.com/main/, you'll need to specify:
<link rel="canonical" href="https://example.com/main/"/>
If the attribute points to a page other than the matching page on the new site, the robot might consider this a difference in the site structure. In this case, the site can't be moved. If the attribute is added to only some pages, the robot won't treat it as pointing to the new main site address.
If you change the URL, make sure that the contents match on the old site and new site. For more information, see relocation instructions.
Tip
To improve the chances that both users and indexing robots will land on the preferred version of your site, use a 301 or 302 redirect.
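Because every page of the old site needs a link to the matching page of the new site, it can be convenient to generate the element rather than write it by hand. The following is a minimal sketch; the function name canonical_tag is hypothetical, and it assumes the new site keeps the same paths and query strings as the old one.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(old_url, new_host):
    """Build a link element pointing to the same path on the new host."""
    parts = urlsplit(old_url)
    # Keep the scheme, path, and query; swap only the host; drop the fragment.
    new_url = urlunsplit((parts.scheme, new_host, parts.path, parts.query, ""))
    return '<link rel="canonical" href="{}"/>'.format(escape(new_url, quote=True))
```

For example, canonical_tag("https://www.example.com/main/", "example.com") produces the element for https://example.com/main/, with the path and trailing slash unchanged.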
Cases where the canonical address isn't taken into account
The Yandex robot doesn't consider the URL canonical if:
- At the time of crawling, the non-canonical pages match the user's query better, and their content differs significantly from the canonical pages. If you are sure that such pages won't be useful in search, prohibit their indexing in the robots.txt file.
- The canonical URL is not accessible to the robot: it redirects to another page or is blocked from indexing. This means it can't be included in the search. In this case, a non-canonical URL can be included in the search instead of the canonical URL, provided the robot can access it.
- The canonical URL points to another domain or subdomain.
- Several canonical URLs are specified.
- A chain of canonical URLs is specified. For example, for example.ru/1, the canonical URL is example.ru/2. At the same time, example.ru/2 has the canonical URL example.ru/3.
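Three of the cases above (another domain or subdomain, several canonical URLs, and chains) can be detected from the declared canonical URLs alone. The sketch below shows one way to check for them. The function name and the input format, a dictionary mapping each page URL to the list of canonical URLs found on it, are assumptions. It treats the www and non-www versions of a domain as the same host, and it doesn't cover the cases that depend on page content or accessibility.

```python
from urllib.parse import urlsplit


def _host(url):
    """Lowercased host name without a leading 'www.' prefix."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def canonical_problems(declared):
    """Map each page URL to the list of problems found for it."""
    problems = {}
    for page, targets in declared.items():
        found = set()
        unique = set(targets)
        if len(unique) > 1:
            found.add("several canonical URLs")
        for target in unique:
            if _host(target) != _host(page):
                found.add("different domain or subdomain")
            # A chain: the target declares a canonical URL other than itself.
            next_targets = set(declared.get(target, [])) - {target}
            if target != page and next_targets:
                found.add("chain of canonical URLs")
        if found:
            problems[page] = sorted(found)
    return problems
```

With the chain from the list above (example.ru/1 pointing to example.ru/2, which points to example.ru/3), only example.ru/1 is reported, because its canonical URL is itself non-canonical.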
Questions and answers
The rel="canonical" attribute points to the page it's located on. Is this an error?
No. If the rel="canonical" attribute on the page refers to this page, the robot considers it canonical.
How do I reinclude a non-canonical page in searches?
If a page was excluded from search results for being non-canonical, it means that the robot found the rel="canonical" attribute with the canonical URL in its HTML code or HTTP header. Delete this reference and check that the page you want to return to the search is not blocked from indexing.