Thanks for your comments on this folks ... lots of food for thought.
coderman@gmail.com said:
the longer discussion is how to make decentralized
search useful. "Google style" search has a terrific performance
advantage over decentralized designs by brute force. however,
take advantage of massive endpoint / peer processing and
resources combined with implicit observational metrics for
reputation and recommendation, inside a well integrated
framework for resource discovery in usable software, and you
have something more robust and more effective than "Google
style" could ever provide.
Yes, it does seem like the speed advantage of centralized search
will be a barrier to adoption of decentralized search. This is
analogous to the difficulty of getting people to adopt systems like Tor
because it is slow. But I think that as more people become aware of
the extent of state/corporate surveillance, they will become more
inclined to accept solutions that are slower in exchange for not
having their search habits monitored, and also being able to receive
uncensored search results. As long as decentralized search is (a)
usable/simple and (b) provides quality results, I feel like speed is
somewhat of a secondary concern. The key question to me is: "How do
we build a search engine that is simple enough for Grandma to use,
and that produces quality results without massive centralized
indexing servers?"
Standalone P2P search applications (e.g. Yacy) don't really make
sense from a usability perspective. It's unrealistic to expect
hundreds of millions of users to download a standalone Java app, and
configure a P2P search node. What would make more sense, and would
lead to much more rapid/widespread adoption, is to use protocols
like WebSockets
/ WebRTC
to facilitate P2P
connectivity in the web browser, so that everything can be
done via a simple browser plugin that can be installed by anyone
with a few clicks, and would then just allow people to use the browser
search bar as usual. This browser integration would also have the
bonus of simplifying the choice of what to index -- it could just
default to indexing bookmarked and frequently-visited pages, and
then be optionally customized by more advanced users to create
custom indexes (i.e. all of the complexity of setting up indexing
could be hidden from the user, unless they choose to look for it).
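To make the "sensible default" idea concrete, here's a rough sketch of what the plugin's default indexing policy might look like. Everything here is hypothetical -- in a real extension the entries would come from the browser's bookmarks/history APIs, and the function name and visit threshold are just made up for illustration:

```javascript
// Hypothetical sketch of the plugin's default indexing policy:
// index a page if it is bookmarked, or visited often enough.
// History entries are mocked as plain objects here.
function selectPagesToIndex(historyEntries, { minVisits = 5 } = {}) {
  return historyEntries
    .filter((e) => e.bookmarked || e.visitCount >= minVisits)
    .map((e) => e.url);
}

// Example: the bookmarked page and the frequently-visited page
// qualify; the rarely-visited one does not.
const pages = selectPagesToIndex([
  { url: "https://example.org/a", bookmarked: true,  visitCount: 1 },
  { url: "https://example.org/b", bookmarked: false, visitCount: 9 },
  { url: "https://example.org/c", bookmarked: false, visitCount: 2 },
]);
console.log(pages); // ["https://example.org/a", "https://example.org/b"]
```

Advanced users could then override the policy (custom URL patterns, extra feeds, etc.) without the defaults ever bothering anyone else.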
To help bootstrap the WebRTC nodes into the P2P network, and to deal
with some of the instability inherent in P2P networks (i.e. by
creating stable "super-peer"
indexing nodes), I like cathalgarvey's suggestion of utilizing
something like a Wordpress plugin that would use the same
index/search standard as the WebRTC clients, but could additionally
bootstrap the web-based clients. As cathalgarvey said:
A standard rather than a codebase. But there's a huge
advantage to this line of thought, if you'll bear with me. A
two-digit fraction of the web right now is powered by
Wordpress.org, who explicitly advocate open/free culture. If you
can convince them to include a social search/index standard of
this type, which is virtually free in terms of computer
resources, then you'd have it deployed across the web in days as
the next update rolled out. Indeed, even if Wordpress seemed
reluctant, a wordpress plugin could probably be written quickly
enough to enable such a thing and make it available for casual
use. Suddenly, a bunch of PHP-powered sites around the web start
committing small bits and pieces of resources to a social search
engine based on human-curated attestations of trust that flow
through a web, helping to confine spammers to the fringes and to
users with stupid taste.
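The nice thing about a standard rather than a codebase is that the wire format can be dead simple -- simple enough that a PHP plugin and a browser client can both speak it. As a rough sketch (all field names invented here; a real standard would have to pin them down), the shared query/response messages might be nothing more than JSON objects:

```javascript
// Hypothetical sketch of a shared query/response format that both a
// WordPress super-peer plugin and a browser-based WebRTC client
// could speak. Field names are invented for illustration only.
const query = {
  type: "search-query",
  id: "q-0001",                       // client-chosen id for matching replies
  terms: ["decentralized", "search"],
  ttl: 4,                             // hop limit as the query spreads
};

const response = {
  type: "search-response",
  id: "q-0001",                       // echoes the query id
  results: [
    { url: "https://example.org/post", title: "Example post", score: 0.8 },
  ],
};

// Any peer, PHP or browser, just exchanges JSON over its transport
// (an HTTP endpoint for the super-peer, a data channel for WebRTC).
const wire = JSON.stringify(query);
const parsed = JSON.parse(wire);
console.log(parsed.type, parsed.ttl);
```

Because it's plain JSON, the super-peer side could be a few dozen lines of PHP, which is exactly what makes the "deployed across the web in days" scenario plausible.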
What would also be interesting is if this standard enabled some kind
of "pingback" mechanism whereby WebRTC nodes could be associated
with specific super-peer nodes (e.g. maybe people who have
bookmarked the super-peer site in their browser, or subscribe to its
feed), so that in addition to broad/random queries that target the
entire P2P network, clients could also create more targeted custom
searches that say something like "start the search with the nodes
that are clustered around these super-peers". This would create an
enormous diversity of search possibilities -- hundreds of thousands
(millions) of different "search engines", each of which would return
different results for the same query, depending on where you start
your search ... This diversity is another reason I find P2P search
interesting, in addition to the benefits re: censorship, traffic
shaping, and surveillance.
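A targeted search of this kind might look like an ordinary query with one extra field naming the preferred entry points. Again, this is just a sketch -- the "startFrom" field and the peer identifiers are assumptions, not part of any existing protocol:

```javascript
// Hypothetical sketch: a targeted query that asks the network to
// start from specific super-peers rather than flooding at random.
// The "startFrom" field and peer addresses are invented examples.
function makeTargetedQuery(terms, superPeers) {
  return {
    type: "search-query",
    terms,
    startFrom: superPeers, // addresses of preferred entry points
    ttl: 3,
  };
}

const q = makeTargetedQuery(
  ["webrtc", "p2p"],
  ["superpeer.blog-a.example", "superpeer.blog-b.example"]
);
console.log(q.startFrom.length); // 2
```

Two users issuing the same terms but different startFrom lists would effectively be using two different "search engines" -- which is the diversity argument in a nutshell.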
I've been looking around for some kind of WebRTC P2P search engine
and haven't found anything yet ... maybe I've found a programming
project for this summer :)
-- Jesse Taylor