In a new episode of the Ask Google Webmasters series, John Mueller shares his thoughts on whether you should block special files in your robots.txt file. In this article, we cover the highlights of the episode in detail. Read on to find out!
Here is the question that was asked of John Mueller:
“Regarding robots.txt, should I ‘disallow: /*.css$’, ‘disallow: /php.ini’, or even ‘disallow: /.htaccess’?”
The user’s question is straightforward: should they block special files such as CSS files, the PHP configuration file (php.ini), and the server configuration file (.htaccess) in robots.txt?
John Mueller’s Response
Mueller started off by saying that Google cannot stop website owners from doing so, but blocking these files in your robots.txt is a bad idea, and in some cases it makes no sense at all. Here is what happens when you block each of the files mentioned in the question.
Blocking CSS Files
The directive Disallow: /*.css$ blocks all URLs on your website that end with the .css extension. Website owners may worry that their CSS files will get indexed by Google and so end up blocking them in their robots.txt file.
According to John Mueller, this is a bad practice. Googlebot needs to crawl CSS files to render your pages properly, which helps Google understand, for example, whether your page is mobile-friendly. If you block your CSS files, Googlebot won’t be able to render and understand your pages properly, which might result in poor rankings.
Mueller emphasized that Google won’t index your CSS files, but Googlebot does need to crawl them. Hence, do not block CSS files in your robots.txt.
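For reference, the pattern from the question would look like this in a robots.txt file. It is shown here purely to illustrate what the directive matches, not as something to copy:
    User-agent: *
    Disallow: /*.css$
The trailing $ anchors the rule to the end of the URL, so any URL ending in .css would be blocked from crawling, which is exactly what Mueller advises against.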
Blocking the php.ini File
php.ini is a configuration file for PHP that should not be accessible to anyone. It is a special file that should be locked down on the server. If no one (including search engine bots) can access it, there is no point in blocking it in your robots.txt file; the rule would be redundant.
Googlebot won’t be able to discover and crawl the file anyway, so you can avoid blocking php.ini in robots.txt.
Blocking the .htaccess File
Like php.ini, .htaccess is a special control file that is locked down from external access by default. It cannot be accessed externally by anyone, so, just like other special files, it is not necessary to block it in your robots.txt file.
Googlebot cannot access your .htaccess file by default and hence blocking it in robots.txt is redundant. John Mueller suggests avoiding this practice.
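To illustrate, these are the kinds of entries the question refers to; since the web server already refuses direct access to these files, the rules add nothing and can simply be left out:
    User-agent: *
    # Redundant: the server already blocks direct access to these files
    Disallow: /php.ini
    Disallow: /.htaccess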
Key Takeaways
John Mueller ended the session with a clear statement: site owners should not copy another website’s robots.txt file and assume it will work for them. Webmasters must analyze which areas of their own website actually need to be blocked, and only then disallow them in robots.txt. In a nutshell, block only the resources where blocking makes sense, rather than creating a robots.txt file just for the sake of it. Robots.txt is a small but sensitive file, and any errors in it might lead to a nightmare.
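As a rough sketch of what a deliberate robots.txt might look like, the example below blocks only a single area of the site; the /internal-search/ path is purely hypothetical and stands in for whichever section of your own site genuinely needs to be kept out of crawling:
    User-agent: *
    # Block only the area that genuinely should not be crawled (hypothetical path)
    Disallow: /internal-search/
    # Optionally point crawlers at your sitemap
    Sitemap: https://www.example.com/sitemap.xml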
Do you agree with Mueller’s suggestions? Let us know in the comments section below.