Google to Stop Supporting the Unofficial Noindex Robots.txt Directive

Ryuzaki

Source: https://webmasters.googleblog.com/2019/07/a-note-on-unsupported-rules-in-robotstxt.html

Heads up. There's a bit of bad information floating around about using 'noindex' in your robots.txt file. Lots of people do it, and while it was never an officially supported directive for that file, Google (and some other crawlers) would sometimes respect it.
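For reference, the unsupported usage looked roughly like this (an illustrative example with a made-up path):

  User-agent: *
  Noindex: /thin-tag-pages/

It mirrors the Disallow syntax, which is part of why it caught on, but it was never part of the actual robots.txt protocol.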

Google, in an effort to get everyone on the same page about how robots.txt should be parsed and interpreted, is open sourcing its parser for it.

This also coincides with them completely dropping support for noindex in robots.txt (effective September 1, 2019). So if this is you, then you're going to need to use one of the other acceptable methods to keep a page out of the index:
  1. Noindex in robots meta tags
  2. 404 and 410 HTTP status codes
  3. Password protection
  4. Disallow in robots.txt
  5. Search Console Remove URL tool
But note, #3 and #4 control access and crawling, not indexing, so the URL can still end up in the index if it's discoverable through links elsewhere. #5 leaves the page in the index; it just stops showing in the SERPs for a while. None of those alone are sufficient to noindex a page. #1 and #2 are the only workable methods for publicly discoverable pages. #5 is just for vanity.
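If you want a drop-in replacement for what you were doing in robots.txt, #1 is it. On the page itself, that's the robots meta tag, something like:

  <meta name="robots" content="noindex">

For non-HTML resources like PDFs, the equivalent is the X-Robots-Tag HTTP response header:

  X-Robots-Tag: noindex

One catch: Google has to be able to crawl the page to see either of these, so don't also Disallow that URL in robots.txt or the noindex will never get picked up.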

Why does this matter? Panda. If pages you thought were noindexed end up in the index anyway, all that thin content counts against your site's overall quality.
 
Final reminder:

Google is now sending out Search Console notifications if you're using noindex within robots.txt. It was never officially supported and now is flat out not supported. You're going to be a sad camper if this is you and you ignore these warnings.

I know many of you avoid all Google products on your sites, which means you won't see these warnings, so I want to get the message out there. Beware!
 