Steve Mynott [SMTP:steve@tightrope.demon.co.uk] wrote: Ever thought how bad the net would be if Google went away?
Actually, we old-timers remember what it was like before Google,
Hypertext in general is doubleplus good.
I remember Gopher, Archie, and all of those crufty tools.
You are all missing the point. Google was being praised for its specific feature of acting as an internet-wide cache of old versions of web pages. Prior to 9/11, the page in question held inflammatory content praising bin Laden. Now it's been pulled. The Google cache lets us see the previous version, thwarting the efforts of the page owner to hide his earlier sentiments.

Google thus serves as an honesty mechanism, holding people responsible for what they have said and making it more difficult for them to conceal revisions to their published opinions.

All this old-fogey talk about "I remember the days before Google" is nothing but hot air. Sure, things were harder before search engines; of course they were, but that's a trivial observation. The point is that this often-unappreciated caching feature of search engines can have a powerful influence on the nature of the Web.

It's unfortunate that we have to rely on Google. Imagine an ongoing, distributed project to cache the web. Volunteers could keep tabs on a subset of corporate and personal web pages and cache old versions when changes are made. Rewriting history becomes that much harder. And it's certainly a better use of computers than seti@home.
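[For concreteness, a minimal Python sketch of the volunteer page-watcher imagined above: fetch each watched page, and store a new timestamped copy only when the content hash changes. The watched URL, cache layout, and file naming are illustrative assumptions, not part of any existing project.]

import hashlib
import time
import urllib.parse
import urllib.request
from pathlib import Path

WATCHED = ["http://www.example.com/statement.html"]  # hypothetical watch list
CACHE = Path("webcache")

def snapshot(url):
    # Fetch the page and hash it; only store a copy if it changed.
    body = urllib.request.urlopen(url, timeout=30).read()
    digest = hashlib.sha256(body).hexdigest()[:16]
    site_dir = CACHE / urllib.parse.quote(url, safe="")
    site_dir.mkdir(parents=True, exist_ok=True)
    if any(digest in p.name for p in site_dir.iterdir()):
        return  # identical copy already on disk
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    (site_dir / ("%s-%s.html" % (stamp, digest))).write_bytes(body)

for url in WATCHED:
    snapshot(url)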
On Tue, Oct 09, 2001 at 11:40:14PM +0200, Nomen Nescio wrote:
Google thus serves as an honesty mechanism, holding people responsible for what they have said and making it more difficult for them to conceal revisions to their published opinions.
Not necessarily. Anyone who wants to post something controversial and deny it later will hide it behind a CGI script or something else Google won't index. Google's cache catches only those who didn't take precautions.
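[For illustration, a sketch of the sort of crawler-dodging CGI Declan means, written here in Python; the bot names checked are assumptions. Serve the deniable page only to clients that don't look like search-engine crawlers, and no cache copy is ever made.]

#!/usr/bin/env python3
# Hypothetical cloaking CGI: crawlers get a bland page, humans get the
# real content, so nothing incriminating ends up in a search-engine cache.
import os

ua = os.environ.get("HTTP_USER_AGENT", "").lower()
crawlers = ("googlebot", "slurp", "scooter")  # illustrative bot names

print("Content-Type: text/html\n")
if any(bot in ua for bot in crawlers):
    print("<html><body>Nothing interesting here.</body></html>")
else:
    print("<html><body>The controversial content goes here.</body></html>")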
It's unfortunate that we have to rely on Google. Imagine an ongoing, distributed project to cache the web. Volunteers could keep tabs on a subset of corporate and personal web pages and cache old versions when changes are made. Rewriting history becomes that much harder. And it's certainly a better use of computers than seti@home.
Sounds great. Good luck in starting it. I'm sure you've estimated how much this will cost -- and what market demand will be.
-Declan (online since the early 1980s, on the Internet since 1988, who still thinks FAQs are a great way to find info)
On Tuesday, October 9, 2001, at 02:40 PM, Nomen Nescio wrote:
Steve Mynott [SMTP:steve@tightrope.demon.co.uk] wrote: Ever thought how bad the net would be if Google went away?
Actually, we old-timers remember what it was like before Google,
Hypertext in general is doubleplus good.
I remember Gopher, Archie, and all of those crufty tools.
You are all missing the point. Google was being praised for its specific feature of acting as an internet-wide cache of old versions of web pages. Prior to 9/11, the page in question held inflammatory content praising bin Laden. Now it's been pulled. The Google cache lets us see the previous version, thwarting the efforts of the page owner to hide his earlier sentiments.
I'm not "missing the point." I chose to talk about something else. Deal with it. Fuck off. --Tim May, Occupied America "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -- Benjamin Franklin, 1759.
At 11:40 PM 10/9/01 +0200, Nomen Nescio wrote:
Google thus serves as an honesty mechanism, holding people responsible for what they have said and making it more difficult for them to conceal revisions to their published opinions.
...
It's unfortunate that we have to rely on Google. Imagine an ongoing, distributed project to cache the web. Volunteers could keep tabs on a subset of corporate and personal web pages and cache old versions when changes are made. Rewriting history becomes that much harder. And it's certainly a better use of computers than seti@home.
Very nice analysis. If you greatly reduce Google's speed of search, what kind of compression gains can you get? Imagine an archive which is highly compressed [1] but used mostly to counter censorship.

[1] That JPGs etc. are already highly compressed means that if you keep pictures, you won't gain as much by trading off search speed for compression.
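[Honig's footnote is easy to check: text compresses dramatically, while already-compressed data barely budges. A quick Python illustration, with random bytes standing in for a JPEG:]

# Verifying the footnote: redundant HTML compresses well; already-compressed
# data (a JPEG, approximated here by random bytes) barely compresses at all.
import os
import zlib

html = b"<html><body><p>Some cached page text, highly redundant.</p></body></html>" * 200
jpeg_like = os.urandom(len(html))

for label, data in (("html", html), ("jpeg-like", jpeg_like)):
    packed = zlib.compress(data, 9)
    print("%s: %d -> %d bytes" % (label, len(data), len(packed)))
# Typical result: the HTML shrinks by well over 90%; the JPEG stand-in
# actually grows slightly.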
Huh? There's little to search with when it comes to images, unless you pull from the surrounding content, which Google does. There are several orders of magnitude difference between storing web page-size content and the kind of filename-size content that would appear in image titles and descriptions. In other words, an index of just JPG image titles would be far smaller than an index of the same number of web pages (though the binary files themselves, which won't be indexed, would likely consume more space).
-Declan

On Tue, Oct 09, 2001 at 06:50:40PM -0700, David Honig wrote:
Very nice analysis.
If you greatly reduce Google's speed of search, what kind of compression gains can you get? Imagine an archive which is highly compressed [1] but used mostly to counter censorship.
[1] That JPGs etc. are already highly compressed means that if you keep pictures, you won't gain as much by trading off search speed for compression.
At 11:23 PM 10/9/01 -0400, Declan McCullagh wrote:
Huh? There's little to search with when it comes to images, unless you pull from the surrounding content, which Google does.
But Google doesn't save the images, including navigational images (which tend to be .gifs), nor does it do a full traversal/crawl. And the dynamic (CGI) site problem. Also, I realize this is a lot of bandwidth. Perhaps sites could automatically self-nominate for mirroring? I.e., Joe Sixpack putting up his housecat site won't bother, but a dissident would. But that's a Freenet-type scheme.
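[One way to read the self-nomination idea: a site publishes a well-known flag file, and volunteers mirror only the sites that carry it. A Python sketch under that assumption; the /mirror-me.txt convention is invented here, not an existing standard.]

# Hypothetical self-nomination check: a volunteer mirrors a site only if
# it publishes a flag file asking for it.
import urllib.error
import urllib.request

def wants_mirroring(site):
    try:
        with urllib.request.urlopen("http://%s/mirror-me.txt" % site, timeout=15) as r:
            return r.status == 200
    except urllib.error.URLError:
        return False  # no flag file (or unreachable): leave the site alone

candidates = ["dissident.example.org", "housecat.example.net"]
print("mirroring:", [s for s in candidates if wants_mirroring(s)])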
There are several orders of magnitude difference between storing web page-size content and the kind of filename-size content that would appear in image titles and descriptions.
Yes, and I realized shortly after posting that Google probably is smart about compressing what it can. Basically I need dense (but slow) nonvolatile memory prices to decrease, not software. Eventually tech could outpace human output. Everyone would have a Slab containing the history (and all uncopyrighted and copyrighted works, the latter licensable of course :-) from the Sumerians to last month's concerts. And everyone painting and singing until the sun burnt out would not fill another Slab. Meanwhile, make backups. And mirror the twisted. :-)
David Honig <honig@sprynet.com> writes:
At 11:23 PM 10/9/01 -0400, Declan McCullagh wrote:
Huh? There's little to search with when it comes to images, unless you pull from the surrounding content, which Google does.
But Google doesn't save the images, including navigational images (which tend to be .gifs), nor does it do a full traversal/crawl.
And the dynamic (CGI) site problem.
Dynamic sites should not be a problem, unless they require users to submit forms before they see the real content. A Google cache of a dynamic site, e.g. omor.com, might have something like 'This page Optimized for 216.239.46.66, googlebot(at)googlebot.com, using Googlebot/2.1 (+http://www.googlebot.com/bot.html)'.
YY
But dynamic sites are a problem. Search engines are often reluctant to index through CGI scripts because of the possibility of infinite loops of on-the-fly generated pages with unique URLs. Some folks call this the hidden web.
-Declan

On Sat, Oct 20, 2001 at 11:07:24PM -0400, Yeoh Yiu wrote:
David Honig <honig@sprynet.com> writes:
At 11:23 PM 10/9/01 -0400, Declan McCullagh wrote:
Huh? There's little to search with when it comes to images, unless you pull from the surrounding content, which Google does.
But Google doesn't save the images, including navigational images (which tend to be .gifs), nor does it do a full traversal/crawl.
And the dynamic (CGI) site problem.
Dynamic sites should not be a problem, unless they require users to submit forms before they see the real content.
A Google cache of a dynamic site, e.g. omor.com, might have something like
'This page Optimized for 216.239.46.66, googlebot(at)googlebot.com, using Googlebot/2.1 (+http://www.googlebot.com/bot.html)'.
YY
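[The loop Declan describes has crude but effective defenses: a crawler can skip query-string URLs entirely and cap how many pages it takes from any one host. A Python sketch of both guards; the limits are arbitrary.]

# Two crude guards against the hidden-web trap: refuse URLs with query
# strings outright, and cap pages fetched per host.
from collections import Counter
from urllib.parse import urlparse

PER_HOST_CAP = 500  # arbitrary budget
pages_per_host = Counter()

def should_crawl(url):
    parts = urlparse(url)
    if parts.query:
        return False  # dynamically generated URL; could loop forever
    if pages_per_host[parts.netloc] >= PER_HOST_CAP:
        return False  # host has used up its budget
    pages_per_host[parts.netloc] += 1
    return True

print(should_crawl("http://example.com/page.html"))         # True
print(should_crawl("http://example.com/cgi-bin/gen?id=42"))  # False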
On Tuesday, 09 Oct 2001 at 23:40, Nomen Nescio <nobody@dizum.com> wrote:
Steve Mynott [SMTP:steve@tightrope.demon.co.uk] wrote: Ever thought how bad the net would be if Google went away?
Actually, we old-timers remember what it was like before Google,
Hypertext in general is doubleplus good.
I remember Gopher, Archie, and all of those crufty tools.
You are all missing the point. Google was being praised for its specific feature of acting as an internet-wide cache of old versions of web pages.
You cut out some lines:
Actually, we old-timers remember what it was like before Google, or even AltaVista (the first real search engine). Heck, some of us remember what it was like before Usenet...
Someone wrote about AltaVista's cache of Usenet back in '93. Cypherpunkly issues of your past coming back to haunt you, and a blurring of private/professional identity, followed. As to how long Google can be an archiver of the whole net: AltaVista went about six years before its owner started looking to sell it, and that didn't carry binary content. Now that's owned by Google as well.
At 11:40 PM 10/9/2001 +0200, Nomen Nescio wrote:
You are all missing the point. Google was being praised for its specific feature of acting as an internet-wide cache of old versions of web pages. Prior to 9/11, the page in question held inflammatory content praising bin Laden. Now it's been pulled. The Google cache lets us see the previous version, thwarting the efforts of the page owner to hide his earlier sentiments.
Google thus serves as an honesty mechanism, holding people responsible for what they have said and making it more difficult for them to conceal revisions to their published opinions.
All this old-fogey talk about "I remember the days before Google" is nothing but hot air. Sure, things were harder before search engines; of course they were, but that's a trivial observation. The point is that this often-unappreciated caching feature of search engines can have a powerful influence on the nature of the Web.
It's unfortunate that we have to rely on Google. Imagine an ongoing, distributed project to cache the web. Volunteers could keep tabs on a subset of corporate and personal web pages and cache old versions when changes are made. Rewriting history becomes that much harder. And it's certainly a better use of computers than seti@home.
This is actually an excellent application front end for a Mojo Nation-type system, as the data is redundant and distributed.
participants (7)
- David Honig
- Declan McCullagh
- Nomen Nescio
- Ralph Seberry
- Steve Schear
- Tim May
- Yeoh Yiu