Are the interwebz rather big for google to index?

Steve Kinney admin at pilobilus.net
Thu Jun 14 09:35:14 PDT 2018



On 06/13/2018 11:08 AM, Georgi Guninski wrote:
> I strongly suspect at least one of the following holds:
> 
> 1. The interwebz are rather big for google to index
> 2. google doesn't return in searches all indexed content on purpose
> 
> Partial evidence: this list and my blog don't appear in searches.

Back around 2000 or so, when Google was under initial development, I
read about other search providers complaining because the NSA loaned
engineers to Google to help them develop their server farm architecture.
Since then I have often wished I had access to the Google search
utilities NSA analysts have.

Other search providers' complaints had nothing to do with the NSA being
"evil" but rather, with Uncle Sam interfering in commerce by playing
favorites:  'Why don't we get this kind of free handout?'  I don't
recall Google disputing the NSA's role in building the foundation of
their physical plant and/or algorithms.

In light of this paper from brookings.edu, I suspect that Google can not
only index the whole of the publicly accessible Internet, but also cache
all of its content:

https://www.brookings.edu/research/recording-everything-digital-storage-as-an-enabler-of-authoritarian-governments/

Professor Villasenor indicated that the storage density now exists to
record and retain /all/ surveillance data.  The redundancy of network
traffic (same web page to 10,000 users) cuts the size of the archive
down to a tiny fraction of the network's gross traffic volume.
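
To put rough numbers on the redundancy point, here is a minimal Python
sketch of content-addressed storage: each response body is stored once
under its SHA-256 digest, however many users fetched it.  The function
name, the 50 KB page size, and the traffic mix are my own illustrative
assumptions, not figures from the Brookings paper or anything known
about Google's or the NSA's systems.

```python
import hashlib


def archive_size(responses):
    """Return (gross_bytes, stored_bytes) for an iterable of response bodies.

    Hypothetical helper: identical bodies are stored only once,
    keyed by their SHA-256 digest.
    """
    store = {}
    gross = 0
    for body in responses:
        gross += len(body)
        digest = hashlib.sha256(body).hexdigest()
        # Keep the first copy only; later duplicates cost nothing extra.
        store.setdefault(digest, body)
    stored = sum(len(body) for body in store.values())
    return gross, stored


# Assumed traffic: one ~50 KB page served to 10,000 users,
# plus 100 small pages that each appear once.
page = b"<html>" + b"x" * 50_000 + b"</html>"
unique_pages = [b"unique page %d" % i for i in range(100)]
traffic = [page] * 10_000 + unique_pages

gross, stored = archive_size(traffic)
print("gross bytes: ", gross)
print("stored bytes:", stored)
print("stored / gross: %.6f" % (stored / gross))
```

Real traffic is messier (per-user cookies, ads, timestamps in the
markup), so exact-match hashing like this would understate the storage
actually needed; it only illustrates why the archive can be far smaller
than the gross traffic volume.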

The NSA's Bluffdale facility may greatly reduce Google's role in
indexing and content storage for Uncle Sam, but once established,
relationships like that only expire as/when the organizations themselves
and their relevant staff members expire.

Google strongly biases search results to reinforce individual "user
preferences" and consumer profiles, both to provide a more addictive
"user experience" and to herd users toward Google's real customers,
their advertising clients.  Seek and ye shall find plenty of information
on that subject; a whole industry (SEO) struggles to sell services based
on their analysis of Google bias, not to mention in-house efforts by
other "public relations" providers to crack Google's algorithms.  Search
is a battlefield.

The best example I remember of Google censorship involved a YouTube
video based on an article about The Facebook's funding sources -
describing how the company was capitalized to go national by DARPA
financiers, and The Facebook's shockingly abusive TOS.  After about 15
minutes of digging I gave up and concluded that the video had been taken
down and Google's memory of it expunged.

Later I did find a saved copy of the video in an old archive of mine.
Searching at Google using the video's exact title and file name did turn
it up on the first try.  Previously, the same words in a different order
and numerous very similar strings returned nothing related to the video
in question.  Searches for "approximate" versions of the title still
returned nothing useful.

I consider that a fine example of how Google itself engages in "negative
SEO" to suppress distribution of selected content while avoiding charges
of "censorship" - in a very narrow technical sense, the suppressed
content does remain available to the public.

"Don't Be Evil - That's OUR Job"

BTW, here's the thing:  https://www.youtube.com/watch?v=DIGdWsxHJlM

A "loose search" now turns up a different copy someone else uploaded,
probably after failing to find the original...

:o)










More information about the cypherpunks mailing list