Every
week, the cybersecurity research team meets for a “scrum” to discuss the
latest updates on their Tor security projects and bounce ideas for new
attacks and defenses off each other.
Recognizing that the internet is not always secure, millions
of people are turning to the Tor anonymity system as a way to browse the
World Wide Web more privately.
However, Tor has vulnerabilities of its own, including one exploited by
an attack known as website fingerprinting. That vulnerability has a team
of faculty and students from RIT’s Center for Cybersecurity researching
the extent of the problem and ways to address it.
Led by Matthew Wright, director of the center, and supported by a
series of projects funded by the National Science Foundation, the team
aims to think like future attackers so it can develop defenses that will
last. The result: new attacks and defenses that use the latest advances
in deep learning.
“Deep learning has proven to be effective in so many applications,”
said Wright, who is also a professor of computing security. “From
self-driving cars to voice recognition in smart home speakers—it’s just a
matter of time before attackers take advantage of those same
techniques.”
Privacy for all
With more than 8 million daily users, Tor has become a popular free
tool for activists, law enforcement, businesses, military, people living
in countries with censorship, and even regular privacy-conscious
individuals.
“When journalists need to communicate more safely with whistleblowers
and dissidents, they often use Tor,” said Wright. “We need this more
secure way to access the internet because it’s essential to our freedom
of speech and privacy.”
Wright explained that Tor creates a secure browsing experience by
encrypting all its connections and sending traffic on a path through
several random servers, rather than making a direct connection to the
user’s desired website. It protects against snooping on which sites a
user visits, such as sites on sensitive issues like religion, health, or
politics.
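For readers who want a concrete picture of that layered routing, here is a
minimal Python sketch of the onion-encryption idea, using the third-party
cryptography package purely for illustration; Tor’s real protocol, cell
format, and key exchange are far more involved.

```python
# Illustrative sketch of onion routing's layered encryption (not Tor's real protocol).
# Requires the third-party "cryptography" package; keys here are purely hypothetical.
from cryptography.fernet import Fernet

# Three relays on the path: guard -> middle -> exit, each with its own key.
relay_keys = [Fernet.generate_key() for _ in range(3)]
relays = [Fernet(k) for k in relay_keys]

message = b"GET https://example.org/"

# The client wraps the message in one layer per relay, innermost layer for the exit.
onion = message
for relay in reversed(relays):
    onion = relay.encrypt(onion)

# Each relay peels exactly one layer; only the exit sees the request, and no
# single relay learns both who the user is and which site they are visiting.
for relay in relays:
    onion = relay.decrypt(onion)

assert onion == message
```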
With the website fingerprinting attack, local eavesdroppers or
internet service providers can collect the encrypted traffic and
identify which website the user is visiting based on specific patterns
in the traffic. While hackers can’t actually see what a user did on the
website, they have already learned something that the user is trying to
protect.
Deep fingerprinting
Tor developers were considering two defenses against website fingerprinting that could cut the attack’s accuracy in half.
Payap Sirinam, a computing and information sciences Ph.D. student,
was tasked with exploring the potential for deep learning in the website
fingerprinting attack.
Adversaries are going to develop this technology themselves anyway,
so the RIT team wanted to figure out how future attacks might work.
While the first website fingerprinting attack used machine-learning
classifiers with manually developed features to analyze traffic, the
team’s new attack would use deep learning, which extracts features
automatically.
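As a rough illustration of that earlier, manual-feature style of attack, the
sketch below pairs a few hand-picked traffic statistics with a scikit-learn
classifier; the feature names and model choice are illustrative, not the
features used in published attacks.

```python
# Sketch of the older manual-feature approach (feature names are illustrative only).
# A trace is a list of signed packet sizes: positive = outgoing, negative = incoming.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def handcrafted_features(trace):
    sizes = np.array(trace)
    outgoing = sizes[sizes > 0]
    incoming = sizes[sizes < 0]
    return [
        len(sizes),                          # total packets
        len(outgoing), len(incoming),        # per-direction counts
        np.abs(sizes).sum(),                 # total bytes transferred
        len(outgoing) / max(len(sizes), 1),  # fraction of outgoing packets
    ]

# X_traces: list of traces; y: index of the website each trace came from.
def train_attack(X_traces, y):
    X = np.array([handcrafted_features(t) for t in X_traces])
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(X, y)
    return clf
```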
“You manually train a machine-learning computer to recognize patterns
in web traffic that humans can’t see—that’s why it’s so good at this
attack,” said Sirinam, who is from Thailand. “By using deep learning,
attackers are essentially able to spend less time training, while
finding even more patterns that they can use to identify a website.”
The RIT team’s new attack, called Deep Fingerprinting, was based on a
Convolutional Neural Network (CNN) that was designed using cutting-edge
deep-learning methods. The attack automatically extracts features from
packet traces and does not require handcrafting features for
classification.
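The sketch below shows, in Keras, the general shape of such a 1-D
convolutional classifier over a fixed-length sequence of packet directions;
the layer sizes and the 95-site closed world are placeholder values, not the
published Deep Fingerprinting architecture.

```python
# Illustrative 1-D CNN in the spirit of Deep Fingerprinting (sizes are placeholders).
# Input: a fixed-length sequence of packet directions (+1 outgoing, -1 incoming, 0 padding).
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN = 5000    # packets per trace
NUM_SITES = 95    # closed world: number of monitored websites (example value)

def build_model():
    model = tf.keras.Sequential([
        layers.Input(shape=(SEQ_LEN, 1)),
        layers.Conv1D(32, 8, activation="relu", padding="same"),
        layers.MaxPooling1D(4),
        layers.Conv1D(64, 8, activation="relu", padding="same"),
        layers.MaxPooling1D(4),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_SITES, activation="softmax"),  # one class per monitored site
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```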
After thousands of hours running trace experiments in a closed-world
setting, the new attack outperformed all previous state-of-the-art
website fingerprinting attacks. The attack was 98 percent effective
against Tor. Even against existing defenses, Deep Fingerprinting had
more than 90 percent accuracy.
The Deep Fingerprinting project included work from Sirinam; Professor
Wright; Marc Juarez, a Ph.D. student at the Belgian research university
KU Leuven; and Mohsen Imani, a former Ph.D. student of Wright’s at the
University of Texas at Arlington. A paper on the NSF-sponsored work was a
finalist for an Outstanding Paper Award, placing it in the top 1
percent of all submitted papers, at the 2018 ACM Conference on Computer
and Communications Security in Toronto.
“Now that we know which defenses aren’t going to work against the new
top-level attacks, it’s up to us to create defenses that do,” said
Sirinam.
Upping our defense
Nate Mathews, a fourth-year computing security major, finds it fun to
work with really difficult and ambiguous problems. However, the dilemma
he’s currently trying to solve is one that his mentor created.
Working together with Sirinam, Mathews is trying to better understand
why the Deep Fingerprinting attack is so effective, in order to develop
a defense that can stop it.
Mathews describes deep learning as a black box: researchers feed data
in and results come out the other end, but it is difficult to see the
inner workings of the box.
“If we could figure out which data features the deep learning thinks
is important, we can identify the particular regions to defend,” said
Mathews, who is from Ross, Ohio.
To help visualize which parts of a trace are most important to the
classification decision made by deep learning, the team is applying the
Grad-CAM technique. Traditionally used in image classification, Grad-CAM
generates heatmaps that show which parts of the trace the deep-learning
algorithm is focusing on.
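In condensed form, and assuming a Keras model like the one sketched above
with a named convolutional layer, the Grad-CAM computation looks roughly
like this; the team’s own tooling may differ.

```python
# Condensed Grad-CAM for a 1-D trace classifier (assumes a Keras model as sketched above).
import numpy as np
import tensorflow as tf

def grad_cam_1d(model, trace, conv_layer_name, class_index):
    # Model that returns both the chosen conv layer's activations and the predictions.
    grad_model = tf.keras.Model(model.inputs,
                                [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(trace[np.newaxis, ..., np.newaxis].astype("float32"))
        score = preds[:, class_index]
    grads = tape.gradient(score, conv_out)          # d(score) / d(feature maps)
    weights = tf.reduce_mean(grads, axis=1)         # average over the time dimension
    cam = tf.reduce_sum(weights[:, tf.newaxis, :] * conv_out, axis=-1)
    cam = tf.nn.relu(cam)[0]                        # keep only positive evidence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # heatmap over trace positions
```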
Using their findings, Mathews and Sirinam are proposing ways to add
fake packets to these important parts of the trace, which can confuse
the deep-learning algorithms.
“It’s like adding noise to a picture of a cat, so you can hide what
kind of animal it is,” said Wright. “You can add noise to the entire
picture, but that’s expensive in our setting. But if we can obscure the
ears and the face, it might be enough.”
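A toy version of that idea, assuming a Grad-CAM heatmap like the one above,
might pad only the flagged regions; the threshold and padding amounts below
are made up.

```python
# Toy sketch: pad only the regions the heatmap flags as important (values are made up).
import numpy as np

def pad_important_regions(trace, heatmap, threshold=0.7, dummies_per_slot=2, rng=None):
    """trace: sequence of packet directions; heatmap: importance per position in [0, 1]."""
    rng = rng or np.random.default_rng()
    # Stretch the heatmap so it has one importance value per packet in the trace.
    heatmap = np.interp(np.linspace(0, 1, len(trace)),
                        np.linspace(0, 1, len(heatmap)), heatmap)
    defended = []
    for direction, importance in zip(trace, heatmap):
        defended.append(direction)
        if importance >= threshold:
            # Insert dummy packets with random directions to blur the local pattern.
            defended.extend(rng.choice([-1, 1]) for _ in range(dummies_per_slot))
    return defended
```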
Saidur Rahman, a computing and information sciences Ph.D. student,
and Aneesh Yogesh Joshi, a computer science master’s degree student from
India, are also developing a new defense strategy that is meant to
trick the deep learning.
Known as the adversarial examples defense, it uses deep learning to
add packets and modify website traces in a way that causes the
classifier to misclassify.
“We borrowed the idea from the domain of computer vision, where you
can distort patterns in the model,” said Rahman, who is from Bangladesh.
“This defense can make Facebook traffic look like Google traffic.”
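A simplified sketch of the adversarial-examples idea against a trace
classifier follows; because a deployable defense can only insert dummy
packets, never remove or alter real ones, the gradient here is used only to
rank insertion points, and every name and parameter is illustrative.

```python
# Simplified adversarial-examples sketch against a trace classifier (illustrative only).
# A real defense can only insert dummy packets, so the gradient is used to rank positions.
import numpy as np
import tensorflow as tf

def choose_insertion_points(model, trace, target_class, num_insertions=50):
    x = tf.convert_to_tensor(trace[np.newaxis, :, np.newaxis], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        preds = model(x)
        # Loss toward the decoy class (e.g., make "Facebook" traffic score as "Google").
        loss = tf.keras.losses.sparse_categorical_crossentropy([target_class], preds)
    grad = tape.gradient(loss, x)[0, :, 0].numpy()
    # Positions where a small change moves the classifier's output the most.
    return np.argsort(np.abs(grad))[-num_insertions:]

def apply_defense(trace, positions, rng=None):
    rng = rng or np.random.default_rng()
    defended = list(trace)
    for pos in sorted(positions, reverse=True):      # insert back-to-front so indices stay valid
        defended.insert(int(pos), int(rng.choice([-1, 1])))  # dummy packet, random direction
    return defended
```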
Before implementing any new defense, the team needs to complete
thousands of experiments in closed-world and more realistic open-world
settings. They also need to take bandwidth and latency overhead into
account: if a defense slows browsing to a crawl, users may find that the
benefits no longer outweigh the costs.
Bolstering the attacks
Taking it one step further, the experts at RIT are trying to find
other attacks they could use to test the robustness of their defenses.
They are developing Tik-Tok, an attack that uses packet timing
information. Prior attacks discounted timing information because the
characteristics change on each visit to a site, making it hard to
extract patterns.
“We saw this as a largely untapped resource and something that might
benefit from adding deep-learning classifiers,” said Rahman. “We
selected and extracted eight new timing features that provide a lot of
value.”
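The article does not spell out the eight features, so the sketch below
computes a few generic burst-timing statistics from (timestamp, direction)
pairs simply to show the kind of signal that packet timing can carry.

```python
# Illustrative timing statistics from a trace of (timestamp, direction) pairs.
# These are generic examples, not the team's eight Tik-Tok features.
import numpy as np

def timing_features(trace):
    """trace: list of (timestamp_seconds, direction) with +1 outgoing, -1 incoming."""
    times = np.array([t for t, _ in trace])
    dirs = np.array([d for _, d in trace])
    gaps = np.diff(times)                               # inter-packet delays
    burst_breaks = np.flatnonzero(np.diff(dirs) != 0)   # where the direction flips
    burst_durations = np.diff(times[np.r_[0, burst_breaks + 1, len(times) - 1]])
    return {
        "total_time": float(times[-1] - times[0]),
        "median_gap": float(np.median(gaps)),
        "gap_std": float(np.std(gaps)),
        "burst_count": int(len(burst_breaks) + 1),
        "mean_burst_duration": float(np.mean(burst_durations)),
    }
```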
Preliminary results indicate that Tik-Tok could be a successful attack in the future.
Sirinam is also developing a new attack and subsequent defense as the
last part of his dissertation. Using a branch of deep learning that he
borrowed from facial recognition, he plans to create an attack that is
more realistic than Deep Fingerprinting.
While the Deep Fingerprinting model may require 1,000 examples from
each website to classify correctly, the new approach, based on N-shot
learning with triplet networks, allows a classifier to learn from only
five examples.
“N-shot learning is like an eco-car that requires fewer resources and
has reasonably good performance, while the sports car—like Deep
Fingerprinting—requires rich resources in order to perform at its best,”
said Sirinam. “This shows the danger of website fingerprinting attacks,
even with less powerful adversaries, so we need to figure out a way to
stop them.”
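A condensed sketch of the triplet-loss training behind N-shot learning
appears below; the embedding network, margin, and layer sizes are
placeholders rather than Sirinam’s actual design.

```python
# Condensed sketch of triplet-loss training for N-shot fingerprinting (sizes are placeholders).
import tensorflow as tf
from tensorflow.keras import layers

EMBED_DIM = 64

def embedding_network(seq_len=5000):
    # Maps a trace to a compact embedding; traces of the same site should land close together.
    return tf.keras.Sequential([
        layers.Input(shape=(seq_len, 1)),
        layers.Conv1D(32, 8, activation="relu", padding="same"),
        layers.GlobalAveragePooling1D(),
        layers.Dense(EMBED_DIM),
    ])

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Pull same-site traces together, push different-site traces at least `margin` apart.
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))

# At attack time, the classifier embeds the handful of known examples per site and labels
# a new trace by its nearest embedded examples, rather than retraining on 1,000 traces per site.
```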
Wright said that throughout these research projects, the Tor
community has been an amazing partner and appreciative of RIT’s efforts.
Many of these defenses could be implemented on Tor in the next two to
three years.
“We know that our defenses will likely be broken in the future—that’s
the nature of cybersecurity,” said Wright. “But we are coming up with
solutions that will help people around the world stay safe for the time
being, and I think that’s what really matters now.”
Construction is underway for RIT’s Global Cybersecurity Institute,
which will help the university become a nexus of cybersecurity education
and research.
The three-story facility will include a cyber learning experience
center, a simulated security operations center, labs, and offices. The
institute will address the critical workforce needs in cybersecurity
through education and professional development programs.
It is expected to open in July 2020 and will be the first facility of its kind in upstate New York.