I have noticed that Bingbot doesn't follow my robots.txt rules: I disallowed all bots, but Bingbot still crawls the site. I already block some bots using .htaccess. Is there a rule to block all bots?

  • 2
    [Bing documentation](https://www.bing.com/webmaster/help/how-to-create-a-robots-txt-file-cb7c31ec) would seem to indicate that real Bing bots *do* follow robots.txt rules - but the problem is, the only way you know some request is from a bot (or a *particular* bot) is if the sender of the request chooses to say so. A non-nice sender can always choose to tell lies instead. – telcoM Oct 12 '19 at 11:29
  • I use Cloudflare and I'm sure it is Bingbot; I checked the IP too. I had just created the website, which is why I blocked all bots, to keep them from indexing it before the design is finished. So is there a rule to block all bots? I use one, but I have to type every name; it should be a pattern like ^bot – Patricia Smith Oct 12 '19 at 20:44
  • following telcoM's comment, I also see there's a difference between crawling and [indexing](https://www.bing.com/webmaster/help/how-can-i-remove-a-url-or-page-from-the-bing-index-37c07477) (and indexing as well as un-indexing requires crawling) and there are different methods to prevent each (but again, crawling must be (re)allowed to prevent indexing...) – A.B Oct 13 '19 at 13:08
  • I want to prevent all bots from crawling the website. There is a rule for this that goes in .htaccess, but I don't remember it – Patricia Smith Oct 13 '19 at 18:13
  • 1
    I'm voting to close this question as off-topic because the question is about Bing. – sebasth Oct 14 '19 at 18:48
  • The question is not about Bing but about how to block all bots via .htaccess , learn how to read , i know that there are a lot of psycho people down vote the others questions without reason here but you are worst one – Patricia Smith Oct 15 '19 at 07:29

1 Answer

All robots ought to be blocked by /robots.txt (not by .htaccess), like this:

# cat robots.txt
User-agent: *
Disallow: /

The file needs to be in the document root and world-readable. Check by opening it in a web browser: http://yourdomain/robots.txt should return the file contents.

Robots can technically choose not to follow this, but they really ought to. I am sure Bing does.

If for some reason (unlikely with the actual Bing) this does not work, try

# cat .htaccess
SetEnvIfNoCase User-Agent .*bot.* search_robot
SetEnvIfNoCase User-Agent .*bing.* search_robot
SetEnvIfNoCase User-Agent .*crawl.* search_robot
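# Allow everyone first, then deny flagged robots (with Deny,Allow the Allow would override the Deny)
Order Allow,Deny
Allow from All
Deny from env=search_robot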

You need to enable the Apache module mod_setenvif for this; see http://www.askapache.com/htaccess/setenvif.html
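
On Apache 2.4, the Order/Deny/Allow directives are deprecated and only work through mod_access_compat. A rough equivalent using the newer Require syntax might look like the sketch below. It assumes mod_setenvif and mod_authz_core are loaded and that .htaccess overrides are permitted for these directives (AllowOverride including FileInfo and AuthConfig); it reuses the search_robot variable name from the rules above.

```apache
# .htaccess sketch for Apache 2.4 (assumes mod_setenvif and mod_authz_core are loaded)
# Flag any request whose User-Agent contains "bot", "bing" or "crawl" (case-insensitive)
SetEnvIfNoCase User-Agent "bot" search_robot
SetEnvIfNoCase User-Agent "bing" search_robot
SetEnvIfNoCase User-Agent "crawl" search_robot

# Grant access to everyone except the requests flagged above
<RequireAll>
    Require all granted
    Require not env search_robot
</RequireAll>
```

Flagged requests should get a 403 Forbidden response. Note that this only matches on the User-Agent header, so a client that does not identify itself as a bot will still get through.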

Ned64
  • Thank you. Bing doesn't follow robots.txt: the website was brand new, I had just created it and blocked all bots in robots.txt, but Bing crawled it many times. I checked the IP address; it is Bing, not a spoofed user agent – Patricia Smith Oct 17 '19 at 14:01