Pages in search results
The Yandex search results are regularly updated. Your site pages can appear in the search results and disappear from them.
With the information provided in Yandex.Webmaster, you can:
- Find out the date of the last robot crawl and the search results update.
- Find out why a page was excluded from the search results.
- Learn the status of new website pages. Even if the pages have been indexed and included in the search immediately after publication, the data in Yandex.Webmaster may be updated with a delay. Usually, it takes a few days for the changes to appear.
By default, the service provides data on the site as a whole. To view the information about a certain section, choose it from the list in the site URL field. Available sections reflect the site structure as known to Yandex (except for the manually added sections).
Page status dynamics
Page information is presented as follows:
- Added and removed — The ratio of the pages included and excluded from the search.
- Excluded — The dynamics of pages excluded from the search.
- History — The dynamics of pages included in the search. Each segment of the graph corresponds to a site section.
- Distribution — The number of pages included in the search. Each segment of the pie chart corresponds to a site section.
The total number of pages included in the search can exceed the total number of pages in the diagrams (only pages and sections listed on the Site structure page in Yandex.Webmaster are counted).
Page changes in the search results
Yandex.Webmaster informs you of changes in the search results over the past three months:
- Date of the search results update.
- Whether a page was added to the search results (the page URL is displayed in green) or removed from them (displayed in blue).
- The reason why the page was removed from the search.
A sharp change in the number of pages added to or included in the search may be related to changes on the site, for example, changes to the website structure or the robots.txt file.
In addition, the service displays general information about the page:
- The date of the last robot crawl.
- The page path from the root directory of the site.
- The page title (the title element).
To view the changes, set the option to Recent changes. Up to 50,000 changes can be displayed.
List of pages included in the search
You can view the list of pages included in the search and the following information about them:
- The date of the last robot crawl.
- The page path from the root directory of the site.
- Page title (the HTML title element).
- Availability of Turbo pages for the URL.
To view the list of pages, set the option to All pages. The list can contain up to 50,000 pages.
If the list doesn't contain any pages that should be included in the search results, use the Reindex pages tool to inform Yandex about them.
If the list contains pages that shouldn't be included in the search results, see the Deleting a site section.
List of pages excluded from the search
Site pages can disappear from the search results. For more information about why this may happen, see the Why are pages excluded from the search? section.
To view the list of excluded pages (up to 50,000), set the option to Excluded pages. The following information about the pages is displayed:
- The date of the last robot crawl.
- The page path from the root directory of the site.
- Status (the reason why the page was excluded).
A page is no longer listed as excluded from the search some time after the following conditions are fulfilled:
- The page is unavailable for the indexing robot (HTTP server returns 404 Not Found), or there is a 301 redirect set on the page.
- Other resources don't refer to the excluded page.
A page that is removed from the site or doesn't exist can be found in the list of pages excluded from the search, probably because other resources refer to it. To remove the page from the list, disallow indexing in the robots.txt file.
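Before relying on a Disallow rule, it can help to test it locally. The sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the /old-page/ path, and the "Yandex" user agent token are assumptions chosen for illustration. The standard-library parser handles only basic rules, so it may not reproduce how the Yandex robot interprets every directive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is an example only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-page/
"""


def is_disallowed(robots_txt: str, url: str, user_agent: str = "Yandex") -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_disallowed(ROBOTS_TXT, "https://example.com/old-page/index.html"))
    print(is_disallowed(ROBOTS_TXT, "https://example.com/about"))
```

The first call checks a URL under the disallowed path and the second a URL outside it, so the two results should differ.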
Data filtering
You can quickly find information about a page using filters. To do this, click the icon. You can filter data by all available parameters. For example, you can use URL filtering:
- Match any of the conditions (corresponds to the “OR” operator).
- Match all conditions (corresponds to the “AND” operator).
To make a list of pages with a certain text in the URL, choose URL contains from the list and enter the URL fragment.
You can use special characters to match the beginning of the string or a substring, and set more complex conditions using regular expressions. To do it, choose URL matches from the list and enter the condition in the field. You can add multiple conditions by putting each of them on a new line.
For conditions, the following rules are available:
Character | Description | Example |
---|---|---|
* | Matches any number of any characters | Display data for all pages that start with https://example.com/tariff/, including the specified page: /tariff/* The * character can also be useful when searching for URLs that contain two or more specific elements, for example, to find news or announcements for a certain year. |
@ | The filtered results contain the specified string (but don't necessarily strictly match it) | Display information for all pages with URLs containing the specified string: @tariff |
~ | Condition is a regular expression | Display data for pages with URLs that match a regular expression. For example, you can filter all pages whose address contains the fragment table, sofa, or bed repeated once or several times: ~table|sofa|bed |
! | Negative condition | Exclude pages with URLs starting with https://example.com/tariff/: !/tariff/* |
Conditions aren't case sensitive.
The @, !, and ~ characters can be used only at the beginning of the string. The following combinations are available:
Operator | Example |
---|---|
!@ | Exclude pages with URLs containing tariff: !@tariff |
!~ | Exclude pages with URLs that match the regular expression |
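The rules above can be approximated offline, for example to preview which URLs a condition would select. This is a rough sketch, not the service's implementation. It assumes that conditions are compared with the path part of the URL, that a plain condition must match the whole path, and that regular expressions follow Python's re syntax. The function names matches and select are hypothetical.

```python
import re
from urllib.parse import urlsplit


def matches(condition: str, url: str) -> bool:
    """Return True if url satisfies a single filter condition."""
    # Assumption: conditions are compared with the lowercased URL path.
    path = urlsplit(url).path.lower()
    cond = condition.strip()

    # "!" at the beginning negates the rest of the condition.
    negate = cond.startswith("!")
    if negate:
        cond = cond[1:]

    if cond.startswith("@"):
        # "@text": the URL contains the text anywhere.
        hit = cond[1:].lower() in path
    elif cond.startswith("~"):
        # "~regex": the URL matches a regular expression.
        hit = re.search(cond[1:], path, re.IGNORECASE) is not None
    else:
        # Plain condition: "*" stands for any number of any characters.
        parts = cond.lower().split("*")
        pattern = ".*".join(re.escape(part) for part in parts)
        hit = re.fullmatch(pattern, path) is not None

    return hit != negate


def select(urls, conditions, mode="any"):
    """Keep URLs matching any ("OR") or all ("AND") of the conditions."""
    combine = any if mode == "any" else all
    return [u for u in urls if combine(matches(c, u) for c in conditions)]


if __name__ == "__main__":
    urls = [
        "https://example.com/tariff/pro",
        "https://example.com/catalog/sofa-2",
        "https://example.com/about",
    ]
    print(select(urls, ["/tariff/*", "~table|sofa|bed"], mode="any"))
    print(select(urls, ["!@tariff", "!~table|sofa|bed"], mode="all"))
```

Passing several conditions with mode="any" mirrors the "OR" option, and mode="all" mirrors the "AND" option.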
Downloading information in a file
You can download the page information in XLS or CSV format.
For recent changes, the file can contain the following data:
- updateDate — The date of the search database update.
- url — The page URL.
- httpCode — The HTTP code received by the robot during the recent crawl.
- status — Page status.
- target — The URL where the page redirects, or the URL displayed in the search results.
- lastAccess — The date when the page was last crawled by the robot.
- title — Page title (the content of the title HTML element).
- event — The event that happened to the page (whether it was included or excluded in the search).
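A downloaded CSV file can be summarized with a short script. The sketch below is one possible approach. The column names come from the list above, while the comma delimiter, the date format, and the sample rows are assumptions made up for illustration, and a real export may differ.

```python
import csv
import io
from collections import Counter

# Invented sample rows; only the column names are taken from the field list.
SAMPLE = """\
url,httpCode,status,lastAccess
https://example.com/,200,SEARCHABLE,2024-01-09
https://example.com/a?ref=1,200,DUPLICATE,2024-01-08
https://example.com/old,404,HTTP_ERROR,2024-01-07
"""

# Statuses that mean the page is shown in the search.
IN_SEARCH = {"SEARCHABLE", "REDIRECT_SEARCHABLE"}


def load_rows(text: str, delimiter: str = ","):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def count_by_status(rows):
    """Count how many pages have each status."""
    return Counter(row["status"] for row in rows)


def excluded_urls(rows):
    """Return URLs whose status is not one of the in-search statuses."""
    return [row["url"] for row in rows if row["status"] not in IN_SEARCH]


if __name__ == "__main__":
    rows = load_rows(SAMPLE)
    print(count_by_status(rows))
    print(excluded_urls(rows))
```

To process a real file, read its contents with open(path, newline="", encoding=...) and pass the text to load_rows, adjusting the delimiter and encoding to match the export.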
Page status in the web interface | Page status in the file | Description | Recommendations |
---|---|---|---|
Low-value or low-demand page | LOW_DEMAND | The algorithm decided not to include the page in search results because demand for the page is probably low. For example, this can happen if there's no content on the page, if the page is a duplicate of pages already known to the robot, or if its content doesn't completely suit user interests. The algorithm automatically checks the pages on a regular basis, so the decision may change later. | To learn more, see Low-value or low-demand pages. |
Excluded by Clean-param | CLEAN_PARAMS | The page was excluded from the search after the robot processed the Clean-param directive. | To get the page indexed, edit the robots.txt file. |
Duplicate | DUPLICATE | The page duplicates a site page that is already in the search. | Specify the preferred URL for the robot using a 301 redirect or the rel="canonical" attribute. If the content of the pages differs, send them for reindexing to speed up the search database update. |
Server connection error | HOST_ERROR | When trying to access the site, the robot could not connect to the server. | Check the server response and make sure that the Yandex robot isn't blocked by the hosting provider. The site is indexed automatically when it becomes available for the robot. |
HTTP error | HTTP_ERROR | An error occurred while accessing the page. | If the problem persists, contact your site administrator or the server administrator. If the page is available at the moment, send it for reindexing. |
Prohibited by the noindex element | META_NO_INDEX | The page was excluded from the search because it is prohibited from indexing (with the robots meta tag that contains the content="noindex" or content="none" directive). | To get the page displayed in the search, remove the ban and send it for reindexing. |
Non-canonical | NOT_CANONICAL | The page is indexed by the canonical URL specified in the rel="canonical" attribute in its source code. | Correct or delete the rel="canonical" attribute if it is specified incorrectly. The robot will track the changes automatically. To speed up the page information update, send the page for reindexing. |
Secondary mirror | NOT_MAIN_MIRROR | The page belongs to a secondary site mirror, so it was excluded from the search. | |
Unknown status | OTHER | The robot has no up-to-date data on the page. | Check the server response and look for HTML elements that prohibit indexing. If the page can't be accessed by the robot, contact the administrator of your site or server. If the page is available at the moment, send it for reindexing. |
Couldn't download page | PARSER_ERROR | When trying to access the page, the robot couldn't get its content. | Check the server response and look for HTML elements that prohibit indexing. If the problem persists, contact your site administrator or the server administrator. If the page is available at the moment, send it for reindexing. |
In search | REDIRECT_SEARCHABLE | The page redirects to another page but is included in the search. | |
Redirect | REDIRECT_NOTSEARCHABLE | The page redirects to another page. The target page is indexed. | Check the indexing of the target page. |
Disallowed in robots.txt (entire site) | ROBOTS_HOST_ERROR | Site indexing is prohibited in the robots.txt file. The robot will automatically start crawling the page when the site becomes available for indexing. | If necessary, make changes to the robots.txt file. |
Disallowed in robots.txt (page) | ROBOTS_TXT_ERROR | Page indexing is prohibited in the robots.txt file. The robot will automatically start crawling the page when it becomes available for indexing. | If necessary, make changes to the robots.txt file. |
In search | SEARCHABLE | The page is included in search and can be displayed in search results for queries. | |
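Two of the statuses in the table, META_NO_INDEX and NOT_CANONICAL, depend on elements in the page's HTML code, so a quick local check of the markup can help explain them. The sketch below uses Python's standard html.parser module. The class name IndexingHints, the function read_hints, and the sample markup are hypothetical, and the check only approximates what a robot might see.

```python
from html.parser import HTMLParser


class IndexingHints(HTMLParser):
    """Collect the robots meta directive and the canonical link of a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            directives = {d.strip().lower() for d in content.split(",")}
            # Both "noindex" and "none" prohibit indexing.
            if directives & {"noindex", "none"}:
                self.noindex = True
        elif tag == "link":
            rel = (attr.get("rel") or "").lower().split()
            if "canonical" in rel:
                self.canonical = attr.get("href")


def read_hints(html: str):
    """Return (noindex, canonical_url) found in the HTML text."""
    parser = IndexingHints()
    parser.feed(html)
    return parser.noindex, parser.canonical


if __name__ == "__main__":
    page = (
        '<html><head><meta name="robots" content="noindex, follow">'
        '<link rel="canonical" href="https://example.com/page"></head></html>'
    )
    print(read_hints(page))
```

If the function reports a noindex directive or an unexpected canonical URL, that is a likely place to start when following the recommendations in the table.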
For the list of all pages, the file can contain the following data:
- url — The page URL.
- lastAccess — The date when the page was last crawled by the robot.
- title — Page title (the HTML title element).
For the list of excluded pages, the file can contain the following data:
- url — The page URL.
- status — Page status.
- lastAccess — The date when the page was last crawled by the robot.
Page status in the web interface | Page status in the file | Description | Recommendations |
---|---|---|---|
Low-value or low-demand page | LOW_DEMAND | The algorithm decided not to include the page in search results because demand for the page is probably low. For example, this can happen if there's no content on the page, if the page is a duplicate of pages already known to the robot, or if its content doesn't completely suit user interests. The algorithm automatically checks the pages on a regular basis, so the decision may change later. | To learn more, see Low-value or low-demand pages. |
Excluded by Clean-param | CLEAN_PARAMS | The page was excluded from the search after the robot processed the Clean-param directive. | To get the page indexed, edit the robots.txt file. |
Duplicate | DUPLICATE | The page duplicates a site page that is already in the search. | Specify the preferred URL for the robot using a 301 redirect or the rel="canonical" attribute. If the content of the pages differs, send them for reindexing to speed up the search database update. |
Server connection error | HOST_ERROR | When trying to access the site, the robot could not connect to the server. | Check the server response and make sure that the Yandex robot isn't blocked by the hosting provider. The site is indexed automatically when it becomes available for the robot. |
HTTP error | HTTP_ERROR | An error occurred while accessing the page. | If the problem persists, contact your site administrator or the server administrator. If the page is available at the moment, send it for reindexing. |
Prohibited by the noindex element | META_NO_INDEX | The page was excluded from the search because it is prohibited from indexing (with the robots meta tag that contains the content="noindex" or content="none" directive). | To get the page displayed in the search, remove the ban and send it for reindexing. |
Non-canonical | NOT_CANONICAL | The page is indexed by the canonical URL specified in the rel="canonical" attribute in its HTML code. | Correct or delete the rel="canonical" attribute if it is specified incorrectly. The robot will track the changes automatically. To speed up the page information update, send the page for reindexing. |
Secondary mirror | NOT_MAIN_MIRROR | The page belongs to a secondary site mirror, so it was excluded from the search. | |
Unknown status | OTHER | The page is known to the robot but it isn't included in the search. | Check the server response and look for HTML elements that prohibit indexing. If the problem persists, contact your site administrator or the server administrator. If the page is available at the moment, send it for reindexing. |
Couldn't download page | PARSER_ERROR | When trying to access the page, the robot couldn't get its content. | Check the server response and look for HTML elements that prohibit indexing. If the problem persists, contact your site administrator or the server administrator. If the page is available at the moment, send it for reindexing. |
Redirect | REDIRECT_NOTSEARCHABLE | The page redirects to another page. The target page is indexed. | Check the indexing of the target page. |
Disallowed in robots.txt (entire site) | ROBOTS_HOST_ERROR | Site indexing is prohibited in the robots.txt file. The robot will automatically start crawling the page when the site becomes available for indexing. | If necessary, make changes to the robots.txt file. |
Disallowed in robots.txt (page) | ROBOTS_TXT_ERROR | Page indexing is prohibited in the robots.txt file. The robot will automatically start crawling the page when it becomes available for indexing. | If necessary, make changes to the robots.txt file. |