How is the robots.txt file checked?
The analyzer automatically loads the robots.txt file from the root directory of your website into the “robots.txt” text box.
When you click the “Check” button, the analyzer parses the contents of the “robots.txt” field line by line and analyzes the directives they contain. You can also find out whether the robot will crawl the pages specified in the “URL list” text box.
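The URL check described above can be approximated locally. The following is a minimal sketch, not the analyzer's own implementation: it uses Python's standard urllib.robotparser module, which follows the original robots.txt standard and may differ from the rules Yandex applies. The robots.txt contents, the URLs, and the check_urls helper are hypothetical and serve only as an illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, as they might appear in the "robots.txt" text box.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""

# Hypothetical entries for the "URL list" text box.
URLS = [
    "http://www.site.com/index.html",
    "http://www.site.com/private/report.html",
    "http://www.site.com/tmp/cache.html",
]


def check_urls(robots_txt, urls, user_agent="Yandex"):
    """Return a dict mapping each URL to True (allowed) or False (disallowed)."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


results = check_urls(ROBOTS_TXT, URLS)
for url, allowed in results.items():
    print("allowed   " if allowed else "disallowed", url)
```

With this parser, the section for Yandex takes precedence over the section for all robots, so only the URL under /private/ is reported as disallowed for the Yandex user agent.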
By editing the rules, you can create a robots.txt file that meets the needs of your website. Keep in mind that these edits do not change the file that resides on your website. For the changes to take effect, you need to upload the updated robots.txt file to your site manually.
In the sections intended for Yandex robots (User-agent: Yandex or User-agent: *), the analyzer checks the directives according to the rules for using robots.txt. For all other sections, the check is performed according to the standard.
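To illustrate how sections addressed to Yandex robots might be picked out of a file, here is a rough sketch. It is an assumption about the general approach, not the analyzer's actual logic: the yandex_sections helper and the sample file are hypothetical, blank lines are simply skipped, and a section is taken to start at the first User-agent line that follows a rule.

```python
def yandex_sections(robots_txt):
    """Return (agents, rules) pairs for sections addressed to 'Yandex' or '*'."""
    sections = []
    agents, rules = [], []
    in_rules = False
    for raw in robots_txt.splitlines():
        # Drop comments and surrounding whitespace; skip empty lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after a rule starts a new section.
                sections.append((agents, rules))
                agents, rules, in_rules = [], [], False
            agents.append(value)
        else:
            in_rules = True
            rules.append(line)
    if agents:
        sections.append((agents, rules))
    return [
        (a, r) for a, r in sections
        if any(name.lower() in ("yandex", "*") for name in a)
    ]


# Hypothetical sample file with three sections.
SAMPLE = """\
User-agent: Googlebot
Disallow: /search/

User-agent: Yandex
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""

selected = yandex_sections(SAMPLE)
for agents, rules in selected:
    print(agents, rules)
```

In this sample, the Googlebot section is left out, and the sections for Yandex and for all robots are returned with their rules.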
When parsing the file, the analyzer reports the errors it finds and warns about inaccuracies in how the rules are written. It also lists the file sections intended for the Yandex robot. The result of the analysis is shown at the bottom of the “Robots.txt analysis” page.
Why does the analyzer return the "This URL does not belong to your domain" error message?
Most likely, you entered the address of one of your site's mirrors in the URL list, e.g. http://site.com instead of http://www.site.com (technically, these are two different URLs). The URLs being checked must belong to the site for which the robots.txt file is being analyzed.
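The difference between a site and its mirror comes down to the host name. A minimal sketch of such a check, assuming a simple exact comparison of host names (the belongs_to_site helper is hypothetical, and the analyzer's real check may be stricter or more lenient):

```python
from urllib.parse import urlsplit


def belongs_to_site(url, site_host):
    """Return True if the URL's host name exactly matches the site's host name."""
    host = urlsplit(url).hostname  # lowercased by urlsplit, None if missing
    return host is not None and host == site_host.lower()


# The site being analyzed is www.site.com, so its mirror site.com does not match.
mirror_ok = belongs_to_site("http://site.com/page.html", "www.site.com")
site_ok = belongs_to_site("http://www.site.com/page.html", "www.site.com")
print(mirror_ok, site_ok)
```

Here the first URL is rejected because site.com and www.site.com are different hosts, while the second one is accepted.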
The robots.txt analysis found errors. How can I find out what caused them?
The analyzer returns two types of messages: errors and warnings. They are described in detail in the Robots.txt analysis error manual.
An error means that the analyzer cannot process a line, a section, or the entire file because of serious syntax errors in robots.txt.
A warning usually means that a rule deviates from the rules in a way the analyzer cannot fix automatically. It may also point to a potential problem, which may turn out not to be a real one, caused by an accidental typo or an inaccurately written rule.