I have noticed that the Bing bot doesn't follow robots.txt rules: I disallowed all bots, but the Bing bot still crawls the site. I already block some bots using .htaccess. Is there code to block all bots?
Viewed 7,800 times
2
-
[Bing documentation](https://www.bing.com/webmaster/help/how-to-create-a-robots-txt-file-cb7c31ec) would seem to indicate that real Bing bots *do* follow robots.txt rules - but the problem is, the only way you know some request is from a bot (or a *particular* bot) is if the sender of the request chooses to say so. A non-nice sender can always choose to tell lies instead. – telcoM Oct 12 '19 at 11:29
-
I use Cloudflare and I'm sure it is the Bing bot; I checked the IP too. I had just created the website, which is why I blocked all bots: to keep them from indexing it before the design is finished. So is there code to block all bots? I use one, but I have to type in every name; it should be something like ^bot. – Patricia Smith Oct 12 '19 at 20:44
-
Following telcoM's comment, I also see there's a difference between crawling and [indexing](https://www.bing.com/webmaster/help/how-can-i-remove-a-url-or-page-from-the-bing-index-37c07477) (indexing as well as un-indexing requires crawling), and there are different methods to prevent each (but again, crawling must be (re)allowed to prevent indexing...) – A.B Oct 13 '19 at 13:08
-
I want to prevent all bots from crawling the website. There is a snippet you can put in .htaccess for that, but I don't remember it. – Patricia Smith Oct 13 '19 at 18:13
-
I'm voting to close this question as off-topic because the question is about Bing. – sebasth Oct 14 '19 at 18:48
-
The question is not about Bing but about how to block all bots via .htaccess. Learn how to read. I know there are a lot of psycho people here who downvote other people's questions for no reason, but you are the worst one. – Patricia Smith Oct 15 '19 at 07:29
1 Answer
3
All robots ought to be blocked by /robots.txt (not by .htaccess), like this:
# cat robots.txt
User-agent: *
Disallow: /
The file needs to be in the document root and world readable. Check by opening it in a web browser: http://yourdomain/robots.txt should give the file contents.
Robots may technically choose not to follow this but really ought to. Bing does, I am sure.
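To double-check that the two-line file really disallows everything for a well-behaved crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the sample URL, and the user-agent names are illustrative assumptions, and the sketch only models a robot that honours robots.txt; it says nothing about what a real crawler does.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as in the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of downloading them from a server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Agent names are examples only; any name is covered by "User-agent: *".
    for agent in ("bingbot", "Googlebot", "SomeOtherCrawler"):
        print(agent, is_allowed(agent, "http://yourdomain/index.html"))
```

If this reports that a path is allowed, the rules themselves are wrong; if it reports everything as disallowed and a crawler still shows up, the crawler is either ignoring the file or not fetching the copy you think it is.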
If for some reason (unlikely with the actual Bing) robots.txt does not work, try this in .htaccess:
# cat .htaccess
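# Tag any request whose User-Agent contains "bot", "bing" or "crawl" (case-insensitive)
SetEnvIfNoCase User-Agent bot search_robot
SetEnvIfNoCase User-Agent bing search_robot
SetEnvIfNoCase User-Agent crawl search_robot
# With Allow,Deny a matching Deny overrides the Allow, so tagged requests are refused
Order Allow,Deny
Allow from all
Deny from env=search_robot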
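You need to enable the mod_setenvif Apache module for this; please see http://www.askapache.com/htaccess/setenvif.html. On Apache 2.4, the Order, Allow and Deny directives are additionally provided by mod_access_compat.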
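Before deploying a rule like this, it can help to see which User-Agent strings it would actually catch. The following is a rough sketch that mimics the three case-insensitive substrings the SetEnvIfNoCase lines look for (bot, bing, crawl). Python's `re` stands in for Apache's regex engine here, which is an assumption that should hold for plain substrings like these; the function name and the sample strings are illustrative only.

```python
import re

# The same three case-insensitive substrings the SetEnvIfNoCase lines look for.
ROBOT_PATTERN = re.compile(r"bot|bing|crawl", re.IGNORECASE)


def looks_like_robot(user_agent: str) -> bool:
    """Return True if the .htaccess rule would tag this User-Agent."""
    return ROBOT_PATTERN.search(user_agent) is not None


# Illustrative User-Agent strings, not an exhaustive or authoritative list.
SAMPLES = [
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (X11; Linux x86_64; rv:69.0) Gecko/20100101 Firefox/69.0",
    "Mozilla/5.0 (Linux; Android 9; CUBOT X19) Mobile Safari/537.36",
]

if __name__ == "__main__":
    for ua in SAMPLES:
        verdict = "blocked" if looks_like_robot(ua) else "allowed"
        print(f"{verdict}: {ua}")
```

Two limits are worth keeping in mind: a broad substring such as "bot" can also match ordinary visitors whose device name happens to contain it (the third sample), and a client that sends a browser-like User-Agent passes regardless, as telcoM's comment points out.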
Ned64
-
Thank you. Bing doesn't follow robots.txt: the website was brand new, I had just created it and blocked all bots in robots.txt, but Bing crawled it many times. I checked the IP address; it is Bing, not a spoofed user agent. – Patricia Smith Oct 17 '19 at 14:01