[IP] US plans massive data sweep

Dave Farber dave at farber.net
Thu Feb 9 03:43:48 PST 2006

-------- Original Message --------
Subject: 	US plans massive data sweep
Date: 	Wed, 08 Feb 2006 19:44:53 -0500 (EST)
From: 	TruChaos at aol.com
To: 	dave at farber.net

*US plans massive data sweep*

Little-known data-collection system could troll news, blogs, even
e-mails. Will it go too far?

By Mark Clayton

The US government is developing a massive computer system that can
collect huge amounts of data and, by linking far-flung information from
blogs and e-mail to government records and intelligence reports, search
for patterns of terrorist activity.

The system - parts of which are operational, parts of which are still
under development - is already credited with helping to foil some plots.
It is the federal government's latest attempt to use broad
data-collection and powerful analysis in the fight against terrorism.
But by delving deeply into the digital minutiae of American life, the
program is also raising concerns that the government is intruding too
deeply into citizens' privacy.

"We don't realize that, as we live our lives and make little choices,
like buying groceries, buying on Amazon, Googling, we're leaving traces
everywhere," says Lee Tien, a staff attorney with the Electronic
Frontier Foundation. "We have an attitude that no one will connect all
those dots. But these programs are about connecting those dots -
analyzing and aggregating them - in a way that we haven't thought about.
It's one of the underlying fundamental issues we have yet to come to
grips with."

The core of this effort is a little-known system called Analysis,
Dissemination, Visualization, Insight, and Semantic Enhancement
(ADVISE). Only a few public documents mention it. ADVISE is a research
and development program within the Department of Homeland Security
(DHS), part of its three-year-old "Threat and Vulnerability, Testing and
Assessment" portfolio. The TVTA received nearly $50 million in federal
funding this year.

DHS officials are circumspect when talking about ADVISE. "I've heard of
it," says Peter Sand, director of privacy technology. "I don't know the
actual status right now. But if it's a system that's been discussed,
then it's something we're involved in at some level."

Data-mining is a key technology

A major part of ADVISE involves data-mining - or "dataveillance," as
some call it. It means sifting through data to look for patterns. If a
supermarket finds that customers who buy cider also tend to buy
fresh-baked bread, it might group the two together. To prevent fraud,
credit-card issuers use data-mining to look for patterns of suspicious
activity.
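
The supermarket example above can be illustrated with a minimal sketch.
It is not drawn from ADVISE or any DHS document; the baskets are made-up
data, and the helper names pair_counts and confidence are hypothetical.
It counts how often two items land in the same basket, then asks what
share of baskets containing one item also contain the other.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


# Invented purchase data, for illustration only.
baskets = [
    ["cider", "bread", "milk"],
    ["cider", "bread"],
    ["milk", "eggs"],
    ["cider", "bread", "eggs"],
    ["bread", "milk"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
print(confidence(baskets, "cider", "bread"))
```

In this toy data, cider and bread are the most frequent pair, and every
basket with cider also holds bread. Real systems apply the same idea to
far larger data sets and add thresholds to filter out chance pairings.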

What sets ADVISE apart is its scope. It would collect a vast array of
corporate and public online information - from financial records to CNN
news stories - and cross-reference it against US intelligence and
law-enforcement records. The system would then store it as "entities" -
linked data about people, places, things, organizations, and events,
according to a report summarizing a 2004 DHS conference in Alexandria,
Va. The storage requirements alone are huge - enough to retain
information about 1 quadrillion entities, the report estimated. If each
entity were a penny, they would collectively form a cube a half-mile
high - roughly double the height of the Empire State Building.
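
The penny comparison can be checked with a rough calculation. The sketch
below rests on assumptions that are not in the article: a US one-cent
coin 19.05 mm across and 1.52 mm thick, with each coin occupying a
rectangular box of that size (simple square stacking, no tighter
packing). The function name cube_side_m is hypothetical.

```python
# Assumed coin dimensions (not from the article): US one-cent coin.
PENNY_DIAMETER_MM = 19.05
PENNY_THICKNESS_MM = 1.52

ENTITIES = 10**15          # 1 quadrillion, the report's estimate
METERS_PER_MILE = 1609.344


def cube_side_m(n_coins):
    """Side length, in meters, of a cube holding n_coins stacked in a square grid."""
    # Each coin is assumed to take up a diameter x diameter x thickness box.
    box_mm3 = PENNY_DIAMETER_MM ** 2 * PENNY_THICKNESS_MM
    total_m3 = n_coins * box_mm3 / 1e9   # 1 cubic meter = 1e9 cubic millimeters
    return total_m3 ** (1 / 3)


side_m = cube_side_m(ENTITIES)
side_miles = side_m / METERS_PER_MILE
print(f"cube side: {side_m:.0f} m, about {side_miles:.2f} miles")
```

Under those assumptions the cube's side comes out to roughly 820 meters,
or about half a mile, which is consistent with the figure quoted above.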

But ADVISE and related DHS technologies aim to do much more, according
to Joseph Kielman, manager of the TVTA portfolio. The key is not merely
to identify terrorists, or sift for key words, but to identify critical
patterns in data that illumine their motives and intentions, he wrote in
a presentation at a November conference in Richland, Wash.

For example: Is a burst of Internet traffic between a few people the
plotting of terrorists, or just bloggers arguing? ADVISE algorithms
would try to determine that before flagging the data pattern for a human
analyst's review.

At least a few pieces of ADVISE are already operational. Consider
Starlight, which along with other "visualization" software tools can
give human analysts a graphical view of data. Viewing data in this way
could reveal patterns not obvious in text or number form. Understanding
the relationships among people, organizations, places, and things -
using social-behavior analysis and other techniques - is essential to
going beyond mere data-mining to comprehensive "knowledge discovery in
databases," Dr. Kielman wrote in his November report. He declined to be
interviewed for this article.

One data program has foiled terrorists

Starlight has already helped foil some terror plots, says Jim Thomas,
one of its developers and director of the government's new National
Visualization and Analytics Center in Richland, Wash. He can't elaborate
because the cases are classified, he adds. But "there's no question that
the technology we've invented here at the lab has been used to protect
our freedoms - and that's pretty cool."

As envisioned, ADVISE and its analytical tools would be used by other
agencies to look for terrorists. "All federal, state, local and
private-sector security entities will be able to share and collaborate
in real time with distributed data warehouses that will provide full
support for analysis and action" for the ADVISE system, says the 2004
workshop report.

Some antiterror efforts die - others just change names

Defense Department

November 2002 - The New York Times identifies a counterterrorism program
called Total Information Awareness.

September 2003 - After terminating TIA on privacy grounds, Congress
shuts down its successor, Terrorism Information Awareness, for the same
reasons.

Department of Homeland Security

February 2003 - The department's Transportation Security Administration
(TSA) announces it's replacing its 1990s-era Computer-Assisted Passenger
Prescreening System (CAPPS I) with a successor, CAPPS II.

July 2004 - TSA cancels CAPPS II because of privacy concerns.

August 2004 - TSA says it will begin testing a similar system - Secure
Flight - with built-in privacy features.

July 2005 - Government auditors charge that Secure Flight is violating
privacy laws by holding information on 43,000 people not suspected of
terrorism.

A program in the shadows

Yet the scope of ADVISE - its stage of development, cost, and most other
details - is so obscure that critics say it poses a major privacy challenge.

"We just don't know enough about this technology, how it works, or what
it is used for," says Marcia Hofmann of the Electronic Privacy
Information Center in Washington. "It matters to a lot of people that
these programs and software exist. We don't really know to what extent
the government is mining personal data."

Even congressmen with direct oversight of DHS, who favor data mining,
say they don't know enough about the program.

"I am not fully briefed on ADVISE," wrote Rep. Curt Weldon (R) of
Pennsylvania, vice chairman of the House Homeland Security Committee, in
an e-mail. "I'll get briefed this week."

Privacy concerns have torpedoed federal data-mining efforts in the past.
In 2002, news reports revealed that the Defense Department was working
on Total Information Awareness, a project aimed at collecting and
sifting vast amounts of personal and government data for clues to
terrorism. An uproar caused Congress to cancel the TIA program a year later.

Echoes of a past controversial plan

ADVISE "looks very much like TIA," Mr. Tien of the Electronic Frontier
Foundation writes in an e-mail. "There's the same emphasis on broad
collection and pattern analysis."

But Mr. Sand, the DHS official, emphasizes that privacy protection would
be built-in. "Before a system leaves the department there's been a
privacy review.... That's our focus."

Some computer scientists support the concepts behind ADVISE.

"This sort of technology does protect against a real threat," says
Jeffrey Ullman, professor emeritus of computer science at Stanford
University. "If a computer suspects me of being a terrorist, but just
says maybe an analyst should look at it ... well, that's no big deal.
This is the type of thing we need to be willing to do, to give up a
certain amount of privacy."

Others are less sure.

"It isn't a bad idea, but you have to do it in a way that demonstrates
its utility - and with provable privacy protection," says Latanya
Sweeney, founder of the Data Privacy Laboratory at Carnegie Mellon
University. But since speaking on privacy at the 2004 DHS workshop, she
now doubts the department is building privacy into ADVISE. "At this
point, ADVISE has no funding for privacy technology."

She cites a recent request for proposal by the Office of Naval Research
on behalf of DHS. Although it doesn't mention ADVISE by name, the
proposal outlines data-technology research that meshes closely with
technology cited in ADVISE documents.

Neither the proposal - nor any other she has seen - provides any funding
for provable privacy technology, she adds.


