Big Brother Really Is Watching
Robert L. Mitchell
January 14, 2008 (Computerworld) -- The year is 2012.
As soon as you walk into the airport, the machines are watching. Are you a
tourist -- or a terrorist posing as one?
As you answer a few questions at the security checkpoint, the systems begin
sizing you up. An array of sensors -- video, audio, laser, infrared -- feeds
a stream of real-time data about you to a computer that uses specially
developed algorithms to spot suspicious people.
The system interprets your gestures and facial expressions, analyzes your
voice and virtually probes your body to determine your temperature, heart
rate, respiration rate and other physiological characteristics -- all in an
effort to determine whether you are trying to deceive.
Fail the test, and you'll be pulled aside for a more aggressive interrogation
and searches.
That scenario may sound like science fiction, but the U.S. Department of
Homeland Security (DHS) is deadly serious about making it a reality.
Interest in the use of what some researchers call behavioral profiling (the
DHS prefers the term "assessing culturally neutral behaviors") for deception
detection intensified last July, when the department's human factors division
asked researchers to develop technologies to support Project Hostile Intent,
an initiative to build systems that automatically identify and analyze
behavioral and physiological cues associated with deception.
That project is part of a broader initiative called the Future Attribute
Screening Technologies Mobile Module, which seeks to create self-contained,
automated screening systems that are portable and relatively easy to
implement.
The DHS has aggressive plans for the technology. The schedule calls for an
initial demonstration for the Transportation Security Administration (TSA)
early this year, followed by test deployments in 2010. By 2012, if all goes
well, the agency hopes to begin deploying automated test systems at airports,
border checkpoints and other points of entry.
If successful, the technology could also be used in private-sector areas such
as building-access control and job-candidate screening. Critics, however, say
that the system will take much longer to develop than the department is
predicting -- and that it might never work at all.
In the Details
"It's a good idea fraught with difficulties," says Bruce Schneier, chief
technology officer at security consultancy BT Counterpane in Santa Clara,
Calif.
Schneier says that focusing on suspicious people is a better idea than trying
to detect suspicious objects. The metal-detecting magnetometers that airport
screeners have relied on for more than 30 years are easily defeated, he says.
But he thinks the technology needed for Project Hostile Intent to succeed is
still at least 15 years out. "We can't even do facial recognition," he says.
"Don't hold your breath."
But Sharla Rausch, director of the DHS's human factors division, says the
agency is already seeing positive results. In a controlled lab setting, she
says, accuracy rates are in the range of 78 to 81%. The tests are still
producing too many false positives, however. "In an operational setting, we
need to be at a higher level than that," Rausch says, and she's confident
that results will improve. At this point, though, it's still unclear how well
the systems will work in real-world settings.
Measuring Hostile Intent
Current research focuses on three key areas. The first is recognition of
gestures and so-called microexpressions -- a poker player might call them
"tells" -- that flash across a person's face in about one-third of a second.
Some researchers say microexpressions can betray a person who is trying to
deceive.
The second area is analysis of variations in speech, such as pitch and
loudness, for indicators of untruthfulness.
The third is measurement of physiological characteristics such as blood
pressure, pulse, skin moisture and respiration -- the signals traditionally
measured by polygraphs, or lie detectors.
By combining the results for all of these modalities, the DHS hopes to
improve the overall predictive accuracy rate beyond what the polygraph -- or
any other means of testing an individual indicator -- can deliver.
That's not a very high bar. The validity of polygraphs has long been
questioned by scientists, and despite decades of research and refinements,
the results of lie-detector tests remain inadmissible in court. While the
U.S. Department of Defense's Defense Academy for Credibility Assessment
(DACA; formerly the Polygraph Institute) puts the median accuracy rate for
polygraphs in the mid-80s when properly administered, others say that number
is closer to 50% in the real world and that the results depend heavily on the
skills of the examiner.
Schneier goes even further. He says lie detectors rely on "fake technology"
that works only in the movies. They remain on the scene, he says, because
people want them to work.
The presumption that combining results from the three areas under study will
increase predictive accuracy is also untested. "We can't
find any indicators that this stuff is being combined [in current research].
The feeling is that [the DHS is] doing some groundbreaking stuff here," says
Rausch.
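In machine-learning terms, what the DHS is describing is usually called
score-level fusion: each modality's detector outputs a probability, and those
probabilities are combined into a single decision score. The sketch below
shows one common recipe, weighted log-odds fusion, purely as an illustration;
the DHS has not published its method, and every modality name, weight and
score here is invented.

    # Illustrative sketch only: the DHS has not published how (or whether)
    # it fuses modality scores. This shows one standard approach,
    # score-level fusion: each detector's probability is converted to
    # log-odds, weighted, summed, then mapped back to a probability.
    import math

    def log_odds(p: float) -> float:
        """Convert a probability to log-odds (logit)."""
        return math.log(p / (1.0 - p))

    def fuse(scores: dict[str, float], weights: dict[str, float]) -> float:
        """Combine per-modality deception probabilities into one score."""
        combined = sum(weights[m] * log_odds(p) for m, p in scores.items())
        return 1.0 / (1.0 + math.exp(-combined))  # back to a probability

    # Hypothetical outputs from three independent detectors.
    scores = {"face": 0.70, "voice": 0.55, "physiology": 0.65}
    weights = {"face": 1.0, "voice": 0.8, "physiology": 1.2}  # assumed reliability
    print(f"fused deception score: {fuse(scores, weights):.2f}")

The appeal of fusion is that weakly informative channels can still nudge the
combined score in the right direction; the risk is that errors correlated
across channels inflate confidence instead of canceling out.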
Hearing Lies
Many researchers are already tackling different pieces of the Hostile Intent
puzzle. Julia Hirschberg, a computer science professor at Columbia
University, is investigating how deception can be detected by picking up on
speech characteristics that vary when someone is lying. The research, funded
by a DHS grant, has identified 250 "acoustic, intonational and lexical
features" that may indicate when a subject is lying.
So far, the best accuracy rate is 67%. Hirschberg admits that's "not great,"
but says it's still better than human observation alone.
The results may not apply to real world situations, however. Her work is
based on lab experiments in which the subject presses a pedal when he is
lying, and machine-learning systems process the results. "It's not ideal,"
she acknowledges. Moreover, the accuracy rate in predicting deception varies
with cultural background as well as personality type. Hirschberg says she has
identified four or five personality types that could affect how the results
should be interpreted.
Adjusting for personality type might improve accuracy in cases where the type
can be identified, but it's doubtful that interviewers in an airport or
border setting will have the insight necessary to do so.
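Hirschberg has not released her models, but the lab setup she describes maps
onto a standard supervised-learning pipeline: per-utterance acoustic features,
truth labels from the pedal presses, and a classifier scored by
cross-validation. A toy sketch of that pipeline, with two synthetic features
standing in for her 250:

    # Toy version of a speech-deception pipeline. The data is synthetic:
    # deceptive speech is modeled with slightly higher, more variable pitch,
    # an assumption made for illustration, not an established finding.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)

    # Fake corpus: 200 utterances x 2 features (mean pitch in Hz, loudness in dB).
    truthful = rng.normal([120.0, 60.0], [10.0, 3.0], size=(100, 2))
    deceptive = rng.normal([128.0, 61.0], [14.0, 3.5], size=(100, 2))
    X = np.vstack([truthful, deceptive])
    y = np.array([0] * 100 + [1] * 100)  # 0 = truthful, 1 = deceptive

    clf = LogisticRegression()
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(f"cross-validated accuracy: {acc:.0%}")

On overlapping distributions like these, accuracy lands in roughly the same
unimpressive 60%-to-70% range Hirschberg reports -- a reminder of why the DHS
wants to combine signals rather than rely on any one of them.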
Dimitris Metaxas, a professor of computer science and biomedical engineering
at Rutgers University, has received funding from both the DHS and the DACA to
use technology to track and interpret the meaning of microexpressions and
gestures. "I'm trying to find the expressions and body movements that are not
normal and could be linked to deception," he says.
Metaxas says his research focuses on movements of the eyebrows and mouth as
well as various head and shoulder gestures, but he wouldn't be more specific.
That's because the exact indicators that he is interested in remain secret.
Although the DHS's Rausch believes that microexpressions are involuntary,
she doesn't want people to know exactly what expressions the agency will be
measuring -- just in case.
"Every system can be broken," Metaxas points out.
Objections and Obstacles
Skeptics say that no tech-based system will work.
The Ekman Group has trained TSA staffers on techniques to help them recognize
and interpret microexpressions. The consultancy was founded by Paul Ekman, a
pioneer in research linking microexpressions to deception. At the TSA,
trained officers use the techniques as part of the agency's Screening of
Passengers by Observation Techniques (SPOT) program.
John Yuille, the Ekman Group's director, doesn't think the technique can be
automated. The discipline is a "social science," he says, and
microexpressions merely represent "clues to truthfulness" that require human
interpretation. "Our methodology is not amenable to technological
intervention," Yuille says.
Metaxas says that what's holding him back at this point isn't technology.
"The basic technology to track the face, I've solved that problem," he says,
claiming an accuracy rate of 70 to 80% with cameras positioned at distances
up to nine feet from the subject.
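Face localization itself is commodity technology today. Here is a minimal
sketch of the kind of off-the-shelf tracking that constitutes the "solved"
part, using OpenCV's stock Haar-cascade detector on a webcam feed; this is
not Metaxas's system, just the publicly available baseline:

    # Not Metaxas's tracker: a baseline face localizer using OpenCV's
    # bundled Haar cascade -- the step that must work before any
    # expression analysis can begin.
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    cap = cv2.VideoCapture(0)  # default webcam

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Tune scaleFactor/minNeighbors for subject distance and lighting.
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow("faces", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
            break

    cap.release()
    cv2.destroyAllWindows()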
The challenge is optimizing the algorithms that relate those expressions to
deception. To do that, he needs more data from psychologists. The theories
linking microexpressions to deception are largely based on academic research.
Although they have been tested in lab settings, they have not been
scientifically proved in large-scale, real-world studies.
Rules must also be applied in the correct context. For example, a measurement
of something like a microexpression must be associated with what was being
said at the time, and the meaning of what was said must be correctly
interpreted, says Hirschberg. The system must also be able to determine
whether there is a mismatch between a given expression or gesture and what
was said.
"That is very difficult [for a computer] to do," she says, so in the lab, the
matching work has been done manually.
In an effort to refine the algorithms, Metaxas has collaborated with Judee
Burgoon, a professor of communication, family studies and human development
at the University of Arizona. She says the lack of rigorous research
validating the use of microexpressions as indicators of deception "gives
everyone pause." It's not known whether microexpressions correspond with
underlying emotions or whether those emotional states correspond to
deception, she says.
Although it is believed that microexpressions are involuntary, it's unclear
whether subjects can "game the system," as they have done with polygraphs.
And many researchers in the field believe that indicators of deception are
culturally dependent. That means analysis that doesn't take cultural
background into account could amount to ethnic, rather than behavioral,
profiling. That's ironic, since using machines to analyze the data is
supposed to help eliminate biases associated with human decision-making.
In fact, the development of "culturally neutral" indicators is a stated goal
of Project Hostile Intent. Rausch believes that researchers can identify
microexpressions and other indicators that are universal or "cross-cultural."
That won't happen in time for the initial test systems. But by 2011, says
Rausch, the DHS should have test systems that use only culturally neutral
indicators.
For Metaxas, the challenge now is to prove that the fundamental assumptions
linking microexpressions to deception are correct. "What I hope I can do is
validate and verify the psychology," he says.
To do that he needs to conduct further tests involving interviews in
real-world situations. But that won't be easy. Privacy and security concerns
have prevented Metaxas and other researchers from monitoring interrogations
or conducting interviews in real-world settings such as airports or
immigration points. Even the DHS faces obstacles in testing the technology in
the field, Rausch acknowledges. And in real-world testing, says Hirschberg,
there's another problem: "You don't really know when the person's lying."
With an aggressive timeline for deployment, Rausch is well aware of the
challenges, and she cautions that the technology is far from complete. "We're
very much in a basic research stage," she says.
Beyond Hostile Intent
Project Hostile Intent is just one of the programs that the DHS's human
factors division is pursuing. Another is violent-intent modeling. By applying
social behavior theory to terrorism, the division is hoping to assist
analysts who must manually sift through thousands of publications, news
feeds and other data.
Researchers are developing indicators of potential violent behavior, which
feed computerized frameworks that help analysts extract relevant data as
they review documents. "Computers help in running the
models. As you put the data together, you get likelihood coefficients for
violent behavior. Our goal is to get that automated for the analysts," says
Rausch.
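Rausch doesn't describe the models' internals. One simple way to turn
document-level indicators into a "likelihood coefficient" is
naive-Bayes-style evidence accumulation, sketched below; the indicator names,
weights and prior are all invented for illustration.

    # Illustrative only: each indicator found in a document shifts the
    # log-odds of the violence hypothesis by an assumed weight. All names
    # and numbers here are hypothetical, not DHS values.
    import math

    INDICATOR_WEIGHTS = {
        "dehumanizing language": 0.9,
        "explicit threat": 1.6,
        "weapons acquisition": 1.2,
        "grievance narrative": 0.5,
    }
    PRIOR_LOG_ODDS = -4.0  # violent intent assumed rare a priori

    def likelihood_coefficient(indicators_found: set[str]) -> float:
        log_odds = PRIOR_LOG_ODDS + sum(
            INDICATOR_WEIGHTS[i] for i in indicators_found
        )
        return 1.0 / (1.0 + math.exp(-log_odds))  # probability in [0, 1]

    flags = {"explicit threat", "weapons acquisition"}
    print(f"likelihood coefficient: {likelihood_coefficient(flags):.3f}")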
The "information-extraction tools" will assist analysts by identifying
important information as they're reading it, but they won't replace analysts.
"We're doing it in a way that's consistent with the way analysts think,"
Rausch says.
Another developing area is biometrics. Research is focused on developing
mobile readers that can perform facial, fingerprint and iris recognition. "As
we push out in years, we'll get into remote biometric [sensors]," as well as
more refined, "10-print" fingerprint recognition, says Rausch.
The systems will tap into "huge databases for identification and
verification," she says.
Other TSA Technologies
The TSA may eventually use the behavioral profiling systems that come from
Project Hostile Intent, but that's just one part of the agency's
transportation security strategy. The layered approach includes "a technology
factor, a human factor and shared intelligence," a spokesperson says.
The TSA's passenger screening technology hasn't changed since the
magnetometer, a metal detector, was introduced in 1973, but the agency is
working on other technologies, including a so-called advanced technology
X-ray. This
high-resolution X-ray system provides clearer images of the contents of
carry-on baggage and offers multiple viewing angles. The machines are already
widely used in Europe. The TSA has purchased 250 of them and plans to have a
total of 500 installed by the end of 2008.
That's a fraction of the 751 checkpoints and 2,000 lanes in service, but 500
machines is enough to cover 75% of the security lanes at the nation's largest
airports, which handle 45% of all travelers.
Another technology is the puffer machine. The subject walks into this phone
booth-like device, and translucent bifold doors close around him. The machine
then blasts the subject with a burst of compressed air and analyzes it for
trace amounts of explosives. The puffer is already in testing in some
airports but hasn't worked well. "They're OK, but I think we'll go more in
the direction of whole-body imaging," says a spokesperson.
In whole body imaging, a machine bombards the subject with radio-frequency
energy as he walks through and creates a very accurate image of his body --
perhaps too accurate -- in order to detect any foreign objects. "There's a
whole lot of privacy issues with this," a spokesperson acknowledges.
The TSA is testing two technologies: one, called backscatter, uses a privacy
algorithm that changes the image to a "chalk outline" of the body, while the
other, called millimeter wave, creates what looks like a photographic
negative.
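The TSA hasn't described the privacy algorithm itself, but a plain
edge-detection pass gives the flavor of how a raw scan can be reduced to an
outline; the file name and thresholds below are placeholders.

    # Sketch of a chalk-outline-style privacy filter using Canny edge
    # detection. "scan.png" is a placeholder, not a real TSA image format.
    import cv2

    img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
    assert img is not None, "scan.png not found"
    edges = cv2.Canny(img, threshold1=50, threshold2=150)  # body contours only
    cv2.imwrite("scan_outline.png", edges)  # white outline on black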
To address privacy concerns, facial images are blurred, and images aren't
saved. In addition, the screener who sees the passenger never sees the
images.
The machines are already in use in Phoenix, where passengers can choose a
pat-down instead, and will show up at Los Angeles International Airport and
John F. Kennedy International Airport soon. "You'll see more whole-body
imaging [in 2008]," a spokesperson says.
Caveats and Ethical Issues
Even if Project Hostile Intent ultimately succeeds, it will not be a panacea
for preventing terrorism, says Schneier. The risk can be reduced, but not
eliminated, he says. "If we had perfect security in airports, terrorists
would go bomb shopping malls," he says. "You'll never be secure by defending
targets."
Assuming that the system gets off the ground, Project Hostile Intent also
faces challenges from privacy advocates.
Although the system would use remote sensors that are physically
"noninvasive," and there are no plans to store the information, the amount of
personal data that would be gathered concerns privacy advocates -- as does
the possibility of false positives.
"We are not going to catch any terrorists, but a lot of innocent people,
especially racial and ethnic minorities, are going to be trapped in a web of
suspicion," says Barry Steinhardt, director of the Technology and Liberty
Project at the American Civil Liberties Union in Washington.
But Steinhardt isn't really worried. He says Project Hostile Intent is just
the latest in a long string of expensive and failed initiatives at the DHS
and the TSA. "I've done hundreds of interviews about these [airline-passenger
screening] schemes," he says. "They never work." Steinhardt adds that
"hundreds of billions" of dollars have been wasted on such initiatives since
9/11. "Show me it works before [we] debate the civil liberties consequences,"
he says.