Would someone at Microsoft please explain why something from IP 209.249.11.4 is crawling this site extensively and using up huge quantities of bandwidth while passing a clearly faked “blogger.com” referrer field in the headers?
This is behavior I usually associate with malicious spam and scraper bots, and indeed in the past I have had to block an IP from which a bot with the user-agent “MSRBOT” was crawling over-aggressively. I had thought at first that it was a script kiddie using an authentic-sounding name to mask his malware, so imagine my dismay to find that MSRBOT is not only coming from an official “Microsoft Research” server this time, but is also rather ill-mannered, going through a different file every 3 minutes and deceptively leaving in its wake a referrer spoofing a competitor’s URL. (At least they’re giving that 3-minute pause rather than going all out without a delay, but still, it makes a mess of my referrer log stream and is not the least bit ethical.)
Of course, it’s entirely possible that the Microsoft server in question has been cracked by a malware/botnet operator, which wouldn’t surprise me all that much. Blocked and blacklisted.
DomainTools has noticed it too. Also see the official MSRBOT FAQ, which makes no mention of the referrer URL spoofing.
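For anyone who wants to check their own logs for the same pattern (the IP, the “MSRBOT” user-agent and the “blogger.com” referrer described above), here is a minimal sketch in Python. It assumes the server writes the common Apache/Nginx "combined" log format, which may not match your setup. The function names and the block lists are hypothetical, made up for illustration, and this is not the method actually used on this site.

```python
import re

# Assumption: logs are in the Apache/Nginx "combined" format:
# ip ident user [time] "request" status size "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical block lists, seeded with the values mentioned in this post.
BLOCKED_IPS = {"209.249.11.4"}
BLOCKED_AGENT_SUBSTRINGS = ("msrbot",)


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def is_blocked(entry):
    """True if the request comes from a listed IP or a listed user-agent."""
    if entry["ip"] in BLOCKED_IPS:
        return True
    agent = entry["agent"].lower()
    return any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS)


def has_suspect_referrer(entry, host="blogger.com"):
    """True if the referrer field mentions the given host."""
    return host in entry["referrer"].lower()


def flag_lines(lines):
    """Yield parsed entries that are blocked and carry the suspect referrer."""
    for line in lines:
        entry = parse_line(line)
        if entry and is_blocked(entry) and has_suspect_referrer(entry):
            yield entry
```

To use it, pass an open log file to `flag_lines` and look at the `time` and `request` fields of each entry it yields. Matching on the user-agent alone is weak evidence, since any client can claim any name, which is why the sketch also checks the IP.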