How to make sure that a robot belongs to Yandex

Some robots can pretend to be Yandex robots by using the relevant User-agent. You can check the authenticity of a robot using a reverse DNS lookup.

To do this, follow these steps:

  1. Determine the IP address of the User-agent in question from your server logs.
  2. Use a reverse DNS lookup of the IP address to determine the host domain name.
  3. Check whether the host belongs to Yandex. The host names of all Yandex robots end in yandex.ru, yandex.net, or yandex.com. If the host name has a different ending, the robot does not belong to Yandex.
  4. Make sure that the name is correct. Use a forward DNS lookup to get the IP address corresponding to the host name. It should match the IP address used in the reverse DNS lookup. If the IP addresses don't match, the host name is fake.
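The four steps above can be sketched in Python with the standard socket module. This is a minimal illustration under stated assumptions, not an official Yandex tool: the names is_yandex_host and verify_yandex_ip, and the injectable lookup parameters, are hypothetical and were chosen for this example only.

```python
import ipaddress
import socket

# Host name endings listed in step 3.
YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")


def is_yandex_host(hostname):
    """Step 3: check that the host name ends in a Yandex domain."""
    host = hostname.rstrip(".").lower()
    return host.endswith(YANDEX_SUFFIXES)


def _reverse_dns(ip):
    # Step 2: reverse (PTR) lookup; returns the primary host name.
    return socket.gethostbyaddr(ip)[0]


def _forward_dns(hostname):
    # Step 4: forward lookup; returns every address the name resolves to.
    # Any IPv6 scope suffix ("%eth0") is dropped before comparison.
    return [info[4][0].split("%")[0] for info in socket.getaddrinfo(hostname, None)]


def verify_yandex_ip(ip, reverse_lookup=_reverse_dns, forward_lookup=_forward_dns):
    """Return True only if the reverse and forward lookups both confirm Yandex."""
    try:
        hostname = reverse_lookup(ip)
        if not is_yandex_host(hostname):
            return False
        wanted = ipaddress.ip_address(ip)
        return any(ipaddress.ip_address(a) == wanted for a in forward_lookup(hostname))
    except (OSError, ValueError):
        # Failed lookups or malformed addresses are treated as unverified.
        return False
```

Calling verify_yandex_ip with only an IP address taken from the server logs performs real DNS queries. The two lookup parameters exist so that the logic can be exercised without network access.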

Yandex robots in server logs

Some Yandex robots download web documents for purposes other than indexing. To avoid being blocked unintentionally by site owners, they may ignore restrictive robots.txt directives intended for arbitrary robots (User-agent: *).

In addition, robots may ignore some robots.txt restrictions for certain sites if there is an agreement between Yandex and the site owners.

Note. If such a robot downloads a document that the main Yandex robot can't access, this document will never be indexed and won't be included in search results.

To restrict such robots from accessing the site, use directives meant for them specifically. For example:

User-agent: YandexCalendar
Disallow: /

User-agent: YandexMobileBot
Disallow: /private/*.txt$
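
The second rule relies on the * and $ special characters. As a rough illustration of how such a pattern can be read, the sketch below converts a rule into a regular expression. It assumes that * stands for any sequence of characters, that a trailing $ anchors the end of the path, and that a rule without $ acts as a prefix match. The helper names are hypothetical, and this is not Yandex's actual matching code.

```python
import re


def rule_to_regex(rule):
    """Translate a Disallow/Allow path pattern into a compiled regex."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal parts and let each '*' match any run of characters.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(path, rule):
    """True if the path is covered by the given Disallow rule."""
    # re.match anchors at the start of the path, so a rule without '$'
    # behaves as a prefix.
    return rule_to_regex(rule).match(path) is not None
```

Under these assumptions, /private/docs/a.txt is covered by /private/*.txt$, while /private/a.txt.bak is not, because $ requires the path to end in .txt.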

Robots use the autonomous networks AS13238 and AS208722, as well as IP addresses that change frequently, so the list of addresses isn't disclosed.

When the robot accesses the page, your server logs may display the User-agent and version of the browser used for crawling the site. For example, Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.268.

Note. The browser version may change, so we recommend not specifying the version when searching for the User-agent in server logs.

Table columns: the robot's full name, including the User-agent; the purpose of the robot; whether the robot takes into account the general rules specified in robots.txt (Yes or No).
Mozilla/5.0 (compatible; YandexAccessibilityBot/3.0; +http://yandex.com/bots) Downloads pages to check their accessibility to users. It sends up to 3 requests to the site per second. The robot ignores the setting in the Yandex Webmaster interface. No
Mozilla/5.0 (compatible; YandexAdNet/1.0; +http://yandex.com/bots) The Yandex Advertising Network robot. Yes
Mozilla/5.0 (compatible; YandexBlogs/0.99; robot; +http://yandex.com/bots) The blog search robot that indexes post comments. Yes
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) The main indexing robot. Yes
Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; +http://yandex.com/bots) Detects site mirrors. Yes
Mozilla/5.0 (compatible; YandexCalendar/1.0; +http://yandex.com/bots) The Yandex Calendar robot. Downloads calendar files by users' requests. These files are often located in directories restricted for indexing. No
Mozilla/5.0 (compatible; YandexDialogs/1.0; +http://yandex.com/bots) Sends requests to Alice's skills. No
Mozilla/5.0 (compatible; YandexDirect/3.0; +http://yandex.com/bots) Downloads information about the content of Yandex Advertising Network partner sites to identify their categories to select relevant ads. No
Mozilla/5.0 (compatible; YandexDirectDyn/1.0; +http://yandex.com/bots) Generates dynamic banners. No
Mozilla/5.0 (compatible; YandexFavicons/1.0; +http://yandex.com/bots) Downloads the site's favicon file to display it in search results. No
Mozilla/5.0 (compatible; YaDirectFetcher/1.0; Dyatel; +http://yandex.com/bots) Downloads landing pages of ads to check their availability and clarify category. This is necessary for ad placement in search results and on partner sites. No. The robot doesn't use the file robots.txt and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexForDomain/1.0; +http://yandex.com/bots) The domain mail robot used to verify domain ownership rights. Yes
Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots) Indexes images to display them on Yandex Images. Yes
Mozilla/5.0 (compatible; YandexImageResizer/2.0; +http://yandex.com/bots) The mobile services robot. Yes
Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots) Defines pages with layout suitable for mobile devices. No
Mozilla/5.0 (compatible; YandexMarket/1.0; +http://yandex.com/bots) The Yandex Market robot. Yes
Mozilla/5.0 (compatible; YandexMarket/2.0; +http://yandex.com/bots) The Yandex Market robot. No
Mozilla/5.0 (compatible; YandexMedia/3.0; +http://yandex.com/bots) Indexes multimedia data. Yes
Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots yabs01) Downloads site pages to check their availability, including landing pages of Yandex Direct ads. No. The robot doesn't use the file robots.txt and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots) The Yandex Metrica robot. No
Mozilla/5.0 (compatible; YandexMetrika/3.0; +http://yandex.com/bots) The Yandex Metrica robot. No
Mozilla/5.0 (compatible; YandexMetrika/4.0; +http://yandex.com/bots) The Yandex Metrica robot. Downloads and caches CSS styles to render site pages in Session Replay. No. The robot doesn't use the file robots.txt and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexMobileScreenShotBot/1.0; +http://yandex.com/bots) Takes a screenshot of a mobile page. No
Mozilla/5.0 (compatible; YandexNews/4.0; +http://yandex.com/bots) The Yandex News robot. Yes
Mozilla/5.0 (compatible; YandexOntoDB/1.0; +http://yandex.com/bots) The information card robot. Yes
Mozilla/5.0 (compatible; YandexOntoDBAPI/1.0; +http://yandex.com/bots) The information card robot that downloads dynamic data. No
Mozilla/5.0 (compatible; YandexPagechecker/1.0; +http://yandex.com/bots) Accesses the page when validating microdata via the Structured data validator form. Yes
Mozilla/5.0 (compatible; YandexPartner/3.0; +http://yandex.com/bots) Downloads information about the content of Yandex partner sites. No
Mozilla/5.0 (compatible; YandexRCA/1.0; +http://yandex.com/bots) Collects data to generate previews. For example, for the extended display of sites in Yandex Search. No
Mozilla/5.0 (compatible; YandexRenderResourcesBot/1.0; +http://yandex.com/bots) Loads resources for page rendering with JavaScript. Ignores instructions in robots.txt if the HTML page where these resources are hosted can be accessed by the Yandex robot. The robot doesn't access the resources if the HTML pages where these resources are used are restricted in robots.txt. No
Mozilla/5.0 (compatible; YandexSearchShop/1.0; +http://yandex.com/bots) Downloads YML files of product catalogs by users' requests. These files are often located in directories restricted for indexing. No
Mozilla/5.0 (compatible; YandexSitelinks; Dyatel; +http://yandex.com/bots) Checks the availability of pages used as sitelinks. Yes
Mozilla/5.0 (compatible; YandexSpravBot/1.0; +http://yandex.com/bots) The Yandex Business robot. Yes
Mozilla/5.0 (compatible; YandexTracker/1.0; +http://yandex.com/bots) The Yandex Tracker robot. No
Mozilla/5.0 (compatible; YandexTurbo/1.0; +http://yandex.com/bots) Crawls the RSS feed created to generate Turbo pages. It sends up to 3 requests to the site per second. The robot ignores the setting in the Yandex Webmaster interface and the Crawl-delay directive. Yes
Mozilla/5.0 (compatible; YandexUserproxy; robot; +http://yandex.com/bots) Proxies user actions in Yandex services. For example, sends requests in response to button clicks and downloads pages for online translation. No
Mozilla/5.0 (compatible; YandexVertis/3.0; +http://yandex.com/bots) The search verticals robot. Yes
Mozilla/5.0 (compatible; YandexVerticals/1.0; +http://yandex.com/bots) The Yandex Classifieds robot: Auto.ru, Yandex Realty, Yandex Jobs, and Yandex Reviews. Yes
Mozilla/5.0 (compatible; YandexVideo/3.0; +http://yandex.com/bots) Indexes videos to display them in the Yandex video search. Yes
Mozilla/5.0 (compatible; YandexVideoParser/1.0; +http://yandex.com/bots) Indexes videos to display them in the Yandex video search. No
Mozilla/5.0 (compatible; YandexWebmaster/2.0; +http://yandex.com/bots) The Yandex Webmaster robot. Yes
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z* Safari/537.36 (compatible; YandexScreenshotBot/3.0; +http://yandex.com/bots) Takes a screenshot of the page. No
"YandexAdditionalBot", "YandexAdditionalBot/1.0", UserAgentFrom, "Mozilla/5.0 (compatible; YandexAdditionalBot/1.0; +http://yandex.com/bots)" Taken into account when processing robots.txt to restrict page content display in quick answers with YandexGPT. Applied to pages indexed by the main search indexing robot. Doesn't make indexing requests. No
"YandexAdditional", "YandexAdditional/1.0", UserAgentFrom, "Mozilla/5.0 (compatible; YandexAdditional/1.0; +http://yandex.com/bots)"
Mozilla/5.0 (compatible; YandexComBot/3.0; +http://ya.cc/bots) It indexes content for the non-Russian-speaking segment of the search. It can index content if an explicit ban has not been specified for it. No

* The combination of W.X.Y.Z characters is a placeholder for the Chrome browser version used by the user agent. For example: 41.0.2272.96.
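
Because the version numbers in the User-agent may change, one way to search server logs is to extract only the robot name and ignore the version. The sketch below assumes that the name is a token starting with Yandex, plus YaDirectFetcher from the table above. The pattern and the function name yandex_robot_name are illustrative assumptions, and a matching name does not prove authenticity: the reverse DNS check described at the beginning of this article is still needed.

```python
import re

# Matches the robot token (for example "YandexBot") and ignores its version.
# "YaDirectFetcher" is listed separately because it does not start with "Yandex".
ROBOT_RE = re.compile(r"\b(Yandex[A-Za-z]+|YaDirectFetcher)(?:/[\d.]+)?")


def yandex_robot_name(user_agent):
    """Return the robot name found in a User-agent string, or None."""
    match = ROBOT_RE.search(user_agent)
    return match.group(1) if match else None
```

For the example User-agent shown before the table, the function returns YandexBot regardless of the YandexBot or Chrome version that follows.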

FAQ

How do I protect myself from fake robots that pretend to be Yandex robots?

To protect yourself from fake robots, use the reverse DNS lookup filter, as described above. We recommend this method instead of managing access by IP addresses, as it is more resistant to changes in the Yandex internal networks.

There's too much traffic going back and forth between my web server and your robot. Does Yandex support downloading of compressed pages?

Yes, it does. Each time the Yandex robot requests a page, it sends the header "Accept-Encoding: gzip,deflate". This means you can set up your web server to reduce the traffic between the server and our robot. However, note that sending compressed content increases CPU usage on your server. If the server is overloaded, this can cause problems. To download gzip and deflate content, the robot follows RFC 2616, section 3.5.
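
In practice, compression is usually enabled in the web server's configuration. For applications that build responses themselves, the idea can be sketched as follows. The function names are hypothetical, and the header parsing is deliberately simplified: it does not handle quality values such as gzip;q=0.

```python
import gzip


def accepts_gzip(accept_encoding):
    """True if the Accept-Encoding header value lists gzip."""
    codings = [item.split(";")[0].strip().lower() for item in accept_encoding.split(",")]
    return "gzip" in codings


def encode_body(body, accept_encoding):
    """Return (payload, extra_headers) for a response body given as bytes."""
    if accepts_gzip(accept_encoding):
        # Compression trades CPU time on the server for less traffic.
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}
```

With the header value sent by the Yandex robot, "gzip,deflate", encode_body returns a gzip-compressed payload together with the Content-Encoding header to add to the response.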