I'd suggest "wget" for spidering sites. It can be told to ignore .robots files. It is good for mirroring sites which you suspect may be taken down. Win/Unix versions available.