Can somebody with a webspider crawl these documents, and put it up on the web?
http://www.whitehouse.gov/robots.txt
-- Eugen Leitl
ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net
On Wed, Dec 10, 2003 at 12:56:24PM +0100, Eugen Leitl wrote:
Can somebody with a webspider crawl these documents, and put it up on the web?
All or nearly all of them are duplicates of the same documents elsewhere in the directory tree; "X/text/" and "X/iraq/" are supposed to be copies of "X/", with images removed in the first case. I suspect that downloading them all would just confirm that.
-- avva
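To make the "X/text/" and "X/iraq/" pattern concrete, here is a minimal Python sketch that maps a disallowed path back to the directory it is presumably a copy of. The function name, the suffix list, and the example paths are all hypothetical illustrations; they are not taken from the actual robots.txt.

```python
from typing import Optional

# Assumption: duplicate trees are marked by a final "text" or "iraq" segment,
# as described above. The real file may contain other patterns.
DUPLICATE_SUFFIXES = ("text", "iraq")


def master_path(path: str) -> Optional[str]:
    """Return the presumed master directory for a duplicate path, else None."""
    parts = path.strip("/").split("/")
    if parts[-1] not in DUPLICATE_SUFFIXES:
        return None  # not one of the duplicate trees
    parent = parts[:-1]
    # Rebuild the parent directory with leading and trailing slashes.
    return "/" + "/".join(parent) + "/" if parent else "/"


# Hypothetical paths, for illustration only.
print(master_path("/news/releases/iraq/"))
print(master_path("/news/text"))
print(master_path("/news/"))
```

A crawler could fetch both the duplicate and its presumed master and compare them, which is the check suggested above.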
This robots.txt issue was exaggerated by leftist critics of the administration. (This is not a general defense of the White House, just a statement of fact.) The Bush WH.gov server has a special Iraq section where press releases, speeches, etc. are reposted in a different HTML template. The WH only wants the "master" copy indexed and not the duplicate copy in the second template. Hence the apparent weirdness in robots.txt. I have not found any skullduggery going on, though I suppose it wouldn't hurt to keep a copy of the Iraq section for "diff" purposes just in case.
-Declan
On Wed, Dec 10, 2003 at 02:59:07PM +0200, Anatoly Vorobey wrote:
On Wed, Dec 10, 2003 at 12:56:24PM +0100, Eugen Leitl wrote:
Can somebody with a webspider crawl these documents, and put it up on the web?
All or nearly all of them are duplicates of the same documents elsewhere in the directory tree; "X/text/" and "X/iraq/" are supposed to be copies of "X/", with images removed in the first case. I suspect that downloading them all would just confirm that.
-- avva
I really would expect that preventing *spiders* (some spiders, even) using the *publicly accessible* robots.txt would be a pretty horribly ineffective form of "skullduggery"... I can think of 10 things to do that are easier, more effective, and less of a potential PR fiasco... see http://shock-awe.info/archive/000965.php
FB`
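For anyone who wants to see why robots.txt only constrains cooperating spiders, here is a small Python sketch using the standard library's urllib.robotparser. The sample rules, the example.org URLs, and the helper name `allowed` are made up for illustration and are not the contents of the real file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules in the same spirit as the file under discussion.
SAMPLE = """\
User-agent: *
Disallow: /news/iraq
Disallow: /news/text
"""


def allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Report whether a well-behaved crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network access
    return parser.can_fetch(agent, url)


print(allowed(SAMPLE, "http://example.org/news/index.html"))
print(allowed(SAMPLE, "http://example.org/news/iraq/index.html"))
```

The check is purely advisory: it tells a polite crawler what the site asks for, and nothing on the server side stops a client that never consults the file.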
participants (4)
- Anatoly Vorobey
- Declan McCullagh
- Eugen Leitl
- FB`