Web spy software hacks into secretive online forums

Eugen Leitl eugen at leitl.org
Wed Apr 14 08:23:33 PDT 2010


http://www.newscientist.com/article/mg20627555.700-web-spy-software-hacks-into-secretive-online-forums.html?full=true&print=true 

Web spy software hacks into secretive online forums

    * 13 April 2010 by Shehryar Mufti

    * Magazine issue 2755.

The dark corners of cyberspace are being illuminated by indexing software
that can reach into secretive websites that are normally inaccessible to
search engines. This could allow search engines to cover online forums
lurking within the "dark web", and provide insights into what is being said
by groups who would rather keep their conversations secret.

Conventional search engines use programs called spiders or web crawlers that
scuttle around the internet and index what they find. However, many websites
are protected by security restrictions that fend off such software. Screening
out all traffic from IP addresses belonging to well-known search engines is
one way to do this.
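
As a rough illustration of the IP screening described above, a site could compare each visitor's address against a list of networks it wants to turn away. This is only a sketch: the function name `is_blocked` and the blocked ranges are assumptions made for the example (the ranges are reserved documentation addresses, not the real addresses of any search engine).

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges (RFC 5737),
# standing in for the address ranges a site might associate with crawlers.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if the visitor's address falls inside a blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


# A request from inside a listed range is refused; any other address passes.
print(is_blocked("192.0.2.17"))
print(is_blocked("203.0.113.5"))
```

A filter of this kind only knows about the addresses on its list, which is why a crawler arriving from an unlisted address is not caught by it.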

The dark web can provide a haven for extremist groups to exchange ideas, says
Hsinchun Chen, director of the artificial intelligence laboratory at the
University of Arizona in Tucson. So Chen and his team devised software to
access and index protected online forums (Journal of the American Society for
Information Science and Technology, DOI: 10.1002/asi.21323).

One of the tricks deployed by Chen's software is to regularly change the
apparent IP address of the computer on which it is running. The software also
disguises its indexing activity by making it look like the traffic generated
by users browsing the forum. What's more, it can attempt to sign up for
membership on forums that require registration, though it has to seek help
from Chen's team if unusual information is asked for. To help it index text
in languages other than English it uses Google Translate, Google's online
translation engine.

Unlike a regular web crawler, Chen's software looks only at sites he has
specified. It has compiled data on 29 restricted forums, containing about 13
million messages in total. On one forum, it took just 39 minutes to index
29,016 posts made over a six-week period.
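
To put those figures in perspective, the arithmetic can be worked through directly. The numbers are the ones quoted above; the variable names, and the reading of "six weeks" as 42 days, are assumptions made for this sketch.

```python
posts = 29_016          # posts indexed on the forum
index_minutes = 39      # time the software took to index them
posting_days = 6 * 7    # six-week posting period, assumed to be 42 days

# Indexing rate achieved by the software.
posts_per_minute = posts / index_minutes
posts_per_second = posts_per_minute / 60

# Average rate at which forum members wrote those posts.
posts_per_day = posts / posting_days

print(f"{posts_per_minute:.0f} posts indexed per minute")
print(f"{posts_per_second:.1f} posts indexed per second")
print(f"{posts_per_day:.0f} posts written per day on average")
```

In other words, the software indexed roughly 744 posts a minute, so a day's worth of posting on this forum was covered in under a minute.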

Chen's team is now analysing the conversations on these forums to build an
overview of the links between participants. He suggests this may be useful in
identifying prominent members.
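
The article does not say which measures Chen's team uses, but one simple way to build such an overview is to count how many replies each member receives. The sketch below does that under stated assumptions: the member names and reply pairs are invented, and the helper names `replies_received` and `rank_members` are hypothetical.

```python
from collections import Counter

# Invented (poster, replied_to) pairs for illustration only.
replies = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("dave", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "carol"),
]


def replies_received(pairs):
    """Count the replies addressed to each member (the in-degree of the reply graph)."""
    return Counter(target for _, target in pairs)


def rank_members(pairs):
    """List every member, most-replied-to first, breaking ties alphabetically."""
    counts = replies_received(pairs)
    members = {member for pair in pairs for member in pair}
    return sorted(members, key=lambda m: (-counts[m], m))


ranking = rank_members(replies)
print(ranking)  # ['bob', 'alice', 'carol', 'dave']
```

Counting replies is only the crudest signal of prominence; a fuller analysis would also weigh who is doing the replying and how the conversations cluster.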

The impressive thing about Chen's forum crawler is the way it combines human
guidance and automated web searches to catalogue dark web forums, says Denis
Roy, a spokesman for Yahoo. "The name of the game," he says, is to "find the
right blend of the least possible number of humans and machines" to perform
this indexing of restricted websites efficiently.




