[random] A Paper on Measuring Attackability of Code Generation

Undescribed Horrific Abuse, One Victim & Survivor of Many gmkarl at gmail.com
Fri Oct 21 05:33:19 PDT 2022


https://arxiv.org/abs/2206.00052

CodeAttack: Code-based Adversarial Attacks for Pre-Trained
Programming Language Models
Akshita Jha
Virginia Tech, Arlington, VA
akshitajha at vt.edu
Chandan K. Reddy
Virginia Tech, Arlington, VA
reddy at cs.vt.edu
Abstract
Pre-trained programming language (PL) models (such as CodeT5,
CodeBERT, GraphCodeBERT, etc.) have the potential to automate
software engineering tasks involving code understanding and code
generation. However, these models are not robust to changes in
the input and thus are potentially susceptible to adversarial
attacks. We propose CodeAttack, a simple yet effective black-box
attack model that uses code structure to generate imperceptible,
effective, and minimally perturbed adversarial code samples. We
demonstrate the vulnerabilities of the state-of-the-art PL models to
code-specific adversarial attacks. We evaluate the transferability of
CodeAttack on several code-code (translation and repair) and code-NL
(summarization) tasks across different programming languages.
CodeAttack outperforms state-of-the-art adversarial NLP attack models
to achieve the best overall performance while being more efficient and
imperceptible.
1 Introduction
There has been a recent surge in the development
of general purpose programming language (PL)
models (such as CodeT5 (Wang et al., 2021), CodeBERT (Feng et al.,
2020), GraphCodeBERT (Guo
et al., 2020), and PLBART (Ahmad et al., 2021))
which can capture the relationship between natural
language and programming language, and potentially automate software
engineering development
tasks involving code understanding (clone detection, defect detection)
and code generation (code-code translation, code-code refinement,
code-NL
summarization). However, given the data-driven
pre-training of these PL models on massive code
data, their robustness and vulnerabilities need careful investigation.
In this work, we demonstrate the
vulnerability of the state-of-the-art programming
language models by generating adversarial samples
that leverage code structure.
Figure 1: CodeAttack makes a slight modification
to the input code snippet (red) which causes significant
changes to the code summary obtained from the SOTA
pre-trained programming language models. Keywords
are highlighted in blue and comments in green.
Adversarial attacks are characterized by imperceptible changes in the
input that result in incorrect
predictions from a neural network. For PL models,
they are important for three primary reasons: (i) Exposing system
vulnerabilities: As a form of stress
test to understand the model’s limitations. For example, adversarial
samples can be used to bypass
a defect detection filter that classifies a given code
as vulnerable or not (Zhou et al., 2019), (ii) Evaluating model
robustness: Analyze the PL model’s
sensitivity to imperceptible perturbations. For example, a small
change in the input programming
language (akin to a typo or a spelling mistake in
the NL scenario) might trigger the code summarization model to
generate a gibberish natural language
code summary (Figure 1), and (iii) Model interpretability: Help
understand what PL models learn.
For example, adversarial samples can be used to
inspect the tokens pre-trained PL models attend to.
A successful adversarial attack for code should
have the following properties: (i) Minimal and imperceptible
perturbations: Akin to spelling mistakes or synonym replacement in NL
that misleads
the neural models, (ii) Code Consistency: Perturbed code is consistent
with the original input,
and (iii) Code fluency: Follows the syntax of the
original programming language. The current NL
adversarial attack models fall short on all three
fronts.
(Preprint: arXiv:2206.00052v1 [cs.CL], 31 May 2022.)
Therefore, we propose CodeAttack (footnote 1: code will be made
publicly available), a simple yet effective black-box attack model
for generating adversarial samples for any input
code snippet, irrespective of the programming language. CodeAttack
operates in a realistic scenario, where the adversary does not have
access to
model parameters but only to the test queries and
the model prediction. CodeAttack uses a pretrained masked CodeBERT PL
model (Feng et al.,
2020) as the adversarial code generator. We leverage the code
structure to generate imperceptible
and effective adversarial attacks through minimal
perturbations constrained to follow the syntax of
the original code. Our primary contributions are as
follows:
• To the best of our knowledge, we are the first
ones to detect the vulnerability of pre-trained programming language
models to adversarial attacks
on different code generation tasks. We propose
a simple yet effective realistic black-box attack
method, CodeAttack, that generates adversarial samples for a code
snippet irrespective of the
input programming language.
• We design a general purpose black-box attack
method for sequence-to-sequence PL models
that is transferable across different downstream
tasks like code translation, repair, and summarization. This can also
be extended to sequence-to-sequence tasks in other domains.
• We demonstrate the effectiveness of
CodeAttack over existing NLP adversarial models through an extensive empirical
evaluation. CodeAttack outperforms the
NLP baselines when considering both the attack
quality and its efficacy.
2 Related Work
Adversarial Attacks in NLP. Adversarial attacks have been used to
analyze the robustness of
NLP models. Black-box adversarial attacks like
BERT-Attack (Li et al., 2020) use BERT, with subword expansion, for
attacking vulnerable words.
BAE (Garg and Ramakrishnan, 2020) also uses
BERT for replacement or insertion around vulnerable words. TextFooler
(Jin et al., 2020) and PWWS
(Ren et al., 2019) use synonyms and part-of-speech
(POS) tagging to replace important tokens. Deepwordbug (Gao et al.,
2018) and TextBugger (Li
et al., 2019) use character insertion, deletion, and
replacement, and constrain their attacks using edit
distance and cosine similarity, respectively. Some
use a greedy search and replacement strategy to
generate adversarial examples (Hsieh et al., 2019;
Yang et al., 2020). Genetic Attack (GA) (Alzantot
et al., 2018) uses a genetic algorithm for search with
language model perplexity and word embedding
distance as substitution constraints. Some adversarial models assume
white-box access and use
model gradients to find substitutes for the vulnerable tokens
(Ebrahimi et al., 2018; Papernot et al.,
2016; Pruthi et al., 2019). None of these methods
have been designed specifically for programming
languages, which are more structured than natural
language.
Adversarial Attacks for PL. Yang et al. (2022)
focus on making adversarial examples more natural by using greedy
search and a genetic algorithm
for replacement. Zhang et al. (2020) generate adversarial examples by
renaming identifiers using
a Metropolis-Hastings sampling based technique
(Metropolis et al., 1953). Yefet et al. (2020) use
gradient based exploration for attacks. Some also
propose metamorphic transformations to generate
adversarial examples (Applis et al., 2021; Ramakrishnan et al., 2020).
The above models focus on
code understanding tasks like defect detection and
clone detection. Although some works do focus
on generating adversarial examples for code summarization
(Ramakrishnan et al., 2020; Zhou et al.,
2021), they do not discuss the transferability
of these attacks to different tasks and different models. Our model,
CodeAttack, assumes black-box access to the state-of-the-art PL models
for
generating adversarial attacks for code generation
tasks like code translation, code repair, and code
summarization using a constrained code-specific
greedy algorithm to find meaningful substitutes for
vulnerable tokens.
3 CodeAttack
We describe the capabilities, knowledge, and the
goal of the proposed CodeAttack model, and
provide details on how it detects vulnerabilities in
the state-of-the-art pre-trained PL models.
3.1 Threat Model
Adversary’s Capabilities. The adversary is capable of perturbing the
test queries given as input
to a pre-trained PL model to generate adversarial samples. We follow
the existing literature for
generating natural language adversarial examples
and allow for two types of perturbations for the
input code sequence: (i) token-level perturbations,
and (ii) character-level perturbations. The adversary is allowed to
perturb only a certain number
of tokens/characters and must ensure a high similarity between the
original code and the perturbed
code. Formally, for a given input sequence $X \in \mathcal{X}$,
where $\mathcal{X}$ is the input space, a valid adversarial example $X_{adv}$ satisfies
the requirements:
$X \neq X_{adv}$ (1)
$X_{adv} \leftarrow X + \delta$ s.t. $\|\delta\| < \theta$ (2)
$\mathrm{Sim}(X_{adv}, X) \geq \epsilon$ (3)
where $\theta$ is the maximum allowed adversarial perturbation; Sim(·) is a
similarity function that takes
into account the syntax of the input code and the
adversarial code sequence; and $\epsilon$ is the similarity
threshold. We describe the perturbation constraints
and the similarity functions in more detail in Section 3.2.2.
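The three requirements in Eqs. (1)-(3) can be read as a simple validity check. The sketch below is a minimal illustration, not the authors' implementation: it assumes both sequences are equal-length token lists, that ||δ|| is the number of changed tokens, and that Sim(·) is a cosine similarity over bag-of-token counts. The function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two token lists using bag-of-token counts (an assumption)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_valid_adversarial(x, x_adv, theta, epsilon):
    """Check Eqs. (1)-(3) for two equal-length token sequences."""
    if len(x) != len(x_adv):
        raise ValueError("this sketch only handles token-level substitutions")
    n_changed = sum(1 for a, b in zip(x, x_adv) if a != b)
    differs = n_changed > 0                              # Eq. (1): X != X_adv
    small = n_changed < theta                            # Eq. (2): ||delta|| < theta
    similar = cosine_similarity(x, x_adv) >= epsilon     # Eq. (3): Sim >= epsilon
    return differs and small and similar


original = ["if", "(", "msgB", "<", "0", ")"]
perturbed = ["if", "(", "msgB", "=", "0", ")"]
ok = is_valid_adversarial(original, perturbed, theta=2, epsilon=0.5)
```

Here a single changed token keeps the perturbed sequence within the budget θ = 2 and above the similarity threshold 0.5 (the threshold value reported in Section 4.3).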
Adversary’s Knowledge. We assume black-box
access to realistically assess the vulnerabilities and
robustness of existing pre-trained PL models. In
this setting, the adversary does not have access to
the model parameters, model architecture, model
gradients, training data, or the loss function. The
adversary can only query the pre-trained PL model
with input sequences and get their corresponding
output probabilities. This is more practical than a
white-box scenario that assumes access to all the
above, which might not always be the case.
Adversary’s Goal. Given an input code sequence as query, the
adversary’s goal is to degrade the quality of the generated output
sequence
through imperceptibly modifying the query. The
generated output sequence can either be a code
snippet (code translation, code repair) or natural
language text (code summarization). Formally,
given a pre-trained PL model $F : \mathcal{X} \rightarrow \mathcal{Y}$, where
$\mathcal{X}$ is the input space and $\mathcal{Y}$ is the output space, the
goal of the adversary is to generate an adversarial
sample $X_{adv}$ for an input sequence $X$ s.t.
$F(X_{adv}) \neq F(X)$ (4)
$Q(F(X)) - Q(F(X_{adv})) \geq \phi$ (5)
where Q(·) measures the quality of the generated
output and φ is the specified drop in quality. This is
in addition to the constraints applied on Xadv earlier. We formulate
our final problem of generating
adversarial samples as follows:
$\Delta_{atk} = \arg\max_{\delta} \, [Q(F(X)) - Q(F(X_{adv}))]$ (6)
In the above optimization equation, Xadv is a minimally perturbed
adversary subject to constraints
on the perturbations δ (Eqs.1-5). CodeAttack
searches for a perturbation ∆atk to maximize the
difference in the quality Q(·) of the output sequence generated from
the original input code snippet X and that by the perturbed code
snippet Xadv.
3.2 Attack Methodology
CodeAttack’s attack methodology can be broken down into two primary
steps: (i) Finding the
most vulnerable tokens, and (ii) Substituting these
vulnerable tokens (subject to code specific constraints), to generate
adversarial samples.
3.2.1 Finding Vulnerable Tokens
Some input tokens contribute more towards the
final prediction than the others, and therefore, ‘attacking’ these
highly influential or highly vulnerable tokens increases the
probability of altering
the model predictions more significantly as opposed to attacking
non-vulnerable tokens. Under a black-box setting, the model gradients
are unavailable and the adversary only has access to the output logits
of the pre-trained PL
model. We therefore define ‘vulnerable tokens’ as tokens
that have a high influence on the output logits of the
model. Let F be an encoder-decoder pre-trained
PL model. The given input sequence is denoted by
$X = [x_1, \ldots, x_i, \ldots, x_m]$, where $\{x_i\}_{1}^{m}$ are the input
tokens. The output is a sequence of vectors:
$O = F(X) = [o_1, \ldots, o_n]$
$y_t = \arg\max(o_t)$
where $\{o_t\}_{1}^{n}$ is the output logit for the correct output token $y_t$ for the time step
$t$. Without loss of
generality, we can also assume the output sequence
$Y = F(X) = [y_1, \ldots, y_l]$. $Y$ can either be a sequence of code tokens or natural language tokens.
To find the vulnerable input tokens, we replace a token $x_i$ with [MASK]
s.t. $X_{\setminus x_i} = [x_1, \ldots, x_{i-1}, \text{[MASK]}, x_{i+1}, \ldots, x_m]$ and get its output logits. The
output vectors are now
$O_{\setminus x_i} = F(X_{\setminus x_i}) = [o'_1, \ldots, o'_q]$
where $\{o'_t\}_{1}^{q}$ is the new output logit for the correct
prediction $Y$. We calculate the influence score for
the token $x_i$ as follows:
$I_{x_i} = \sum_{t=1}^{n} o_t - \sum_{t=1}^{q} o'_t$ (7)
Token Class   Description
Keywords      Reserved word
Identifiers   Variable, Class Name, Method name
Arguments     Integer, Floating point, String, Character
Operators     Brackets ({},(),[]), Symbols (+,*,/,-,%,;,.)
Table 1: Token class and their description.
We rank all the input tokens according to their influence score $I_{x_i}$
in descending order to find the most
vulnerable tokens V. We select only the top-k tokens to limit the
number of perturbations and attack them iteratively either by
completely replacing
them or by adding or deleting a character around
them. We explain this in detail below.
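The masking-and-ranking step described above can be sketched as follows. This is a hedged illustration under simplifying assumptions: the victim model is represented by a hypothetical callable `victim_logits` that returns one scalar logit per output step (the logit of the predicted token), and `toy_logits` with its weights is a made-up stand-in, not a real PL model.

```python
def influence_scores(tokens, victim_logits, mask_token="[MASK]"):
    """Eq. (7): drop in summed output logits when each input token is masked."""
    base = sum(victim_logits(tokens))
    scores = []
    for i in range(len(tokens)):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        scores.append(base - sum(victim_logits(masked)))
    return scores


def top_k_vulnerable(tokens, victim_logits, k):
    """Indices of the k tokens with the highest influence score, in descending order."""
    scores = influence_scores(tokens, victim_logits)
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy stand-in for a black-box victim model (weights are invented for illustration).
TOY_WEIGHTS = {"msgB": 3.0, "<": 2.0, "return": 1.0}


def toy_logits(tokens):
    return [TOY_WEIGHTS.get(t, 0.0) for t in tokens]


tokens = ["if", "msgB", "<", "0", "return"]
scores = influence_scores(tokens, toy_logits)
vulnerable = top_k_vulnerable(tokens, toy_logits, k=2)
```

Each token costs one extra query to the victim model, so ranking a sequence of m tokens takes m + 1 queries in this sketch.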
3.2.2 Substituting Vulnerable Tokens
We adopt greedy search using a masked programming language model,
subject to code specific constraints, to find substitutes S for
vulnerable tokens
V , s.t. they are minimally perturbed and have the
maximal probability of incorrect prediction.
Search Method. In a given input sequence, we
mask a vulnerable token vi and use the masked
PL model to predict a meaningful contextualised
token in its place. We use the top-k predictions for
each of the masked vulnerable tokens as our initial
search space. Let M denote a masked PL model.
Given an input sequence $X = [x_1, \ldots, v_i, \ldots, x_m]$,
where $v_i$ is a vulnerable token, M uses the WordPiece
algorithm (Wu et al., 2016) for tokenization, which
breaks uncommon words into sub-words, resulting
in $H = [h_1, h_2, \ldots, h_q]$. We align and mask all the
corresponding sub-words for $v_i$ and combine the
predictions to get the top-k substitutes $S' = M(H)$
for the vulnerable token $v_i$. This initial search
space $S'$ consists of $l$ possible substitutes for a
vulnerable token $v_i$. We then filter out substitute
tokens to ensure minimal perturbation, code consistency, and code
fluency of the generated adversarial
samples, subject to the following constraints.
Constraints. Since the tokens generated from a
masked PL model may not be meaningful individual code tokens, we
further use a CodeNet tokenizer (Puri et al., 2021) to break a token
into its corresponding code tokens. CodeNet tokenizes
the input tokens based on four primary code token
classes as shown in Table 1. If $s_i$ is the substitute
for the vulnerable token $v_i$ as tokenized by M, and
Op(·) denotes the operators present in any given
token using the CodeNet tokenizer, we allow the substitute tokens to have
an extra or a missing operator
(akin to making typos).
$|Op(v_i)| - 1 \leq |Op(s_i)| \leq |Op(v_i)| + 1$ (8)
If C(·) denotes the code token classes (identifiers,
keywords, and arguments) of a given token, we
maintain the alignment between $v_i$ and
the potential substitute $s_i$ as follows.
$C(v_i) = C(s_i)$ and $|C(v_i)| = |C(s_i)|$ (9)
These constraints maintain the syntactic structure
of Xadv and significantly reduce the search space.
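A rough sketch of how the filters in Eqs. (8) and (9) might be applied to a candidate substitute is given below. It does not use the CodeNet tokenizer: a small regular expression stands in for it, the operator set is taken from Table 1, and the keyword list is an illustrative subset. All names here are assumptions made for the example.

```python
import re

OPERATORS = set("{}()[]+*/-%;.")  # brackets and symbols listed in Table 1
KEYWORDS = {"public", "private", "static", "void", "return",
            "if", "else", "for", "while", "class", "override"}  # illustrative subset

# String literals, identifiers/keywords, numbers, then any other single character.
_PIECE = re.compile(r'"[^"]*"|\'[^\']*\'|[A-Za-z_]\w*|\d+(?:\.\d+)?|\S')


def code_tokens(token):
    """Split a model token into code pieces (a simplified stand-in for CodeNet)."""
    return _PIECE.findall(token)


def op_count(token):
    """|Op(.)|: number of operator characters in the token."""
    return sum(1 for piece in code_tokens(token) if piece in OPERATORS)


def token_classes(token):
    """C(.): ordered classes of the non-operator pieces in the token."""
    classes = []
    for piece in code_tokens(token):
        if piece in OPERATORS:
            continue
        if piece in KEYWORDS:
            classes.append("keyword")
        elif piece[0].isdigit() or piece[0] in "\"'":
            classes.append("argument")
        elif piece[0].isalpha() or piece[0] == "_":
            classes.append("identifier")
        else:
            classes.append("other")
    return classes


def satisfies_constraints(v, s):
    """True if substitute s passes Eq. (8) and Eq. (9) with respect to v."""
    ops_ok = op_count(v) - 1 <= op_count(s) <= op_count(v) + 1   # Eq. (8)
    classes_ok = token_classes(v) == token_classes(s)            # Eq. (9), incl. equal length
    return ops_ok and classes_ok
```

Under these assumptions, `satisfies_constraints("(byte", "((bytes")` accepts a substitute with one extra bracket, while `satisfies_constraints("public", "citizenship")` rejects a keyword replaced by an identifier.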
Substitutions. We allow two types of substitutions to generate
adversarial examples: (i) Token-level substitution, and (ii) Operator
(character)
level substitution where only an operator is added,
replaced, or deleted. We iteratively substitute the
vulnerable tokens with their corresponding substitute
tokens/characters, using the reduced search
space S, until the adversary’s goal is met.
We only allow replacing p% of the vulnerable tokens/characters to keep
perturbations to a
minimum, where p is a hyperparameter. We also
maintain the cosine similarity between the input
text X and the adversarially perturbed text Xadv
above a certain threshold (Equation 3). The complete algorithm is
shown in Algorithm 1.
CodeAttack maintains minimal perturbation,
code fluency, and code consistency between the
input and the adversarial code snippet.
4 Experiments
4.1 Downstream Tasks and Datasets
We show the transferability of CodeAttack
across three different downstream tasks and
datasets – all in different programming languages.
Code Translation involves translating one programming language to the
other. The publicly available code translation datasets (footnotes 2-5)
consist of parallel functions between Java and C#. There are a
total of 11,800 paired functions, out of which 1000
are used for testing. After tokenization, the average
sequence length for Java functions is 38.51 tokens,
and the average length for C# functions is 46.16.
Footnote 2: http://lucene.apache.org/
Footnote 3: http://poi.apache.org/
Footnote 4: https://github.com/eclipse/jgit/
Footnote 5: https://github.com/antlr/
Algorithm 1 CodeAttack: Generating adversarial examples for Code
Input: Code X; Victim model F; Maximum perturbation θ; Similarity ε;
       Performance Drop φ
Output: Adversarial Example X_adv
Initialize: X_adv ← X
// Find vulnerable tokens ‘V’
for x_i in M(X) do
    Calculate I_{x_i} acc. to Eq. (7)
end
V ← Rank(x_i) based on I_{x_i}
// Find substitutes ‘S’
for v_i in V do
    S ← Filter(v_i) subject to Eqs. (8), (9)
    for s_j in S do
        // Attack the victim model
        X_adv = [x_1, ..., x_{i−1}, s_j, ..., x_m]
        if Q(F(X)) − Q(F(X_adv)) ≥ φ and Sim(X, X_adv) ≥ ε and ||X_adv − X|| ≤ θ then
            return X_adv // Success
        end
    end
    // One perturbation
    X_adv ← [x_1, ..., x_{i−1}, s_j, ..., x_m]
end
return
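The control flow of Algorithm 1 can be sketched in Python as below. This is an illustrative reading rather than the released implementation: the victim model, quality metric, and similarity function are passed in as hypothetical callables, ||X_adv − X|| is taken to be the number of changed tokens, and the step labelled "One perturbation" is interpreted as keeping the substitute with the largest quality drop, which the pseudocode leaves unspecified.

```python
def code_attack(x, ranked_positions, substitutes_for, quality_drop,
                similarity, theta, epsilon, phi):
    """Greedy search following the structure of Algorithm 1.

    x                 : list of input tokens
    ranked_positions  : token indices, most vulnerable first (the set V)
    substitutes_for(i): filtered candidate substitutes for position i (the set S)
    quality_drop(c)   : Q(F(x)) - Q(F(c)) for a candidate sequence c
    similarity(a, b)  : Sim(a, b)
    Returns an adversarial token list, or None if no candidate meets the goal.
    """
    x_adv = list(x)
    for i in ranked_positions:
        best, best_drop = None, quality_drop(x_adv)
        for s in substitutes_for(i):
            cand = x_adv[:i] + [s] + x_adv[i + 1:]
            n_changed = sum(a != b for a, b in zip(x, cand))
            if n_changed > theta or similarity(x, cand) < epsilon:
                continue  # violates the perturbation or similarity constraint
            drop = quality_drop(cand)
            if drop >= phi:
                return cand  # success: required performance drop reached
            if drop > best_drop:
                best, best_drop = cand, drop
        if best is not None:
            x_adv = best  # keep one perturbation and move to the next token
    return None


# Toy setup (all values are invented for illustration).
x = ["a", "b", "c"]
weights = [5.0, 3.0, 0.0]  # hypothetical quality drop caused by changing each position


def toy_quality_drop(x_adv):
    return sum(w for w, a, b in zip(weights, x, x_adv) if a != b)


result = code_attack(
    x,
    ranked_positions=[0, 1],
    substitutes_for=lambda i: [x[i] + "_"],
    quality_drop=toy_quality_drop,
    similarity=lambda a, b: 1.0,
    theta=2, epsilon=0.5, phi=8.0,
)
```

In the toy run, changing the first token alone gives a drop of 5.0, which is below φ = 8.0, so that perturbation is kept and the second token is attacked next; the combined drop of 8.0 then meets the goal.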
Code Repair refines code by automatically fixing bugs. The publicly
available code repair dataset
(Tufano et al., 2019) consists of buggy Java functions as source and
their corresponding fixed functions as target. We use the small subset
of the data
with 46,680 train, 5,835 validation, and 5,835 test
samples (≤ 50 tokens in each function).
Code Summarization involves generating a natural language summary for a
given code. We use
the CodeSearchNet dataset (Husain et al., 2019)
which consists of code and their corresponding
summaries in natural language. We show the results of our model on
Python (252K/14K/15K),
Java (165K/5K/11K), and PHP (241K/13K/15K).
The numbers in parentheses denote the approximate number of
samples in the train/development/test sets, respectively.
4.2 Victim Models
We pick a representative method from different
categories as our victim models to attack.
• CodeT5 (Wang et al., 2021): A unified pretrained encoder-decoder
transformer-based PL
model that leverages code semantics by using
an identifier-aware pre-training objective. This
is the state-of-the-art on several sub-tasks in the
CodeXGlue benchmark (Lu et al., 2021).
• CodeBERT (Feng et al., 2020): A bimodal pretrained programming
language model that performs code-code and code-NL tasks.
• GraphCodeBERT (Guo et al., 2020): Pre-trained
graph programming language model that leverages code structure through
data flow graphs.
• RoBERTa (Liu et al., 2019): Pre-trained natural language model with
state-of-the-art results on
GLUE (Wang et al., 2018), RACE (Lai et al.,
2017), and SQuAD (Rajpurkar et al., 2016)
datasets.
For our experiments, we use the publicly available fine-tuned
checkpoints for CodeT5 and fine-tune CodeBERT, GraphCodeBERT, and
RoBERTa
on the related downstream tasks.
4.3 CodeAttack Configurations
The proposed CodeAttack model is implemented in PyTorch. For the
purpose of our experiments, we use the publicly available pre-trained
CodeBERT (MLM) masked PL model as the adversarial example generator.
We select the top 50
predictions for each vulnerable token as the initial
search space. On average, we attack only
2 to 4 vulnerable tokens for all the tasks to keep
the perturbations to a minimum. The cosine similarity threshold
between the original code snippet and adversarially generated code is
0.5. Since
CodeAttack does not require any training, we
attack the victim models on the test set using a
batch-size of 256. All experiments were conducted
on a 48 GiB RTX 8000 GPU.
4.4 Evaluation Metric
Downstream Performance. We measure the
downstream performance using CodeBLEU (Ren
et al., 2020) and BLEU (Papineni et al., 2002) before and after the
attack. CodeBLEU measures
the quality of the generated code snippet for code
translation and code repair, and BLEU measures
the quality of the generated natural language code
summary. To measure the efficacy of the attack
model, we define
$\Delta_{drop} = Q_{before} - Q_{after} = Q(Y) - Q(Y_{adv})$
where Q = {CodeBLEU, BLEU}, $Y$ is the output
sequence generated from the original code $X$, and
$Y_{adv}$ is the sequence generated from the perturbed
code $X_{adv}$.

                                                   Downstream Performance Attack Quality               Overall
Task                   Victim Model   Attack Model Before  After  ∆drop   Attack%  #Query   CodeBLEUq  GMean
Translate (Code-Code)  CodeT5         TextFooler   73.99   68.08  5.91    28.29    94.95    63.19      21.94
Translate (Code-Code)  CodeT5         BERT-Attack  73.99   48.59  25.40   83.12    186.1    51.11      47.61
Translate (Code-Code)  CodeT5         CodeAttack   73.99   61.72  12.27   89.3     36.84    65.91      41.64
Translate (Code-Code)  CodeBERT       TextFooler   71.16   60.45  10.71   49.2     73.91    66.61      32.74
Translate (Code-Code)  CodeBERT       BERT-Attack  71.16   58.80  12.36   97.1     48.76    59.90      41.58
Translate (Code-Code)  CodeBERT       CodeAttack   71.16   54.14  17.03   97.7     26.43    66.89      48.09
Translate (Code-Code)  GraphCodeBERT  TextFooler   66.80   46.51  20.29   38.70    83.17    63.62      36.83
Translate (Code-Code)  GraphCodeBERT  BERT-Attack  66.80   36.54  30.26   97.33    41.30    57.41      55.30
Translate (Code-Code)  GraphCodeBERT  CodeAttack   66.80   38.81  27.99   98       20.60    65.39      56.39
Repair (Code-Code)     CodeT5         TextFooler   61.13   57.59  3.53    58.84    90.50    69.53      24.36
Repair (Code-Code)     CodeT5         BERT-Attack  61.13   52.70  8.42    98.3     74.99    55.94      35.79
Repair (Code-Code)     CodeT5         CodeAttack   61.13   53.21  7.92    99.36    30.68    69.03      37.87
Repair (Code-Code)     CodeBERT       TextFooler   61.33   53.55  7.78    81.61    45.89    68.16      35.11
Repair (Code-Code)     CodeBERT       BERT-Attack  61.33   51.95  9.38    98.3     74.99    55.94      37.22
Repair (Code-Code)     CodeBERT       CodeAttack   61.33   52.02  9.31    99.39    25.98    68.05      39.78
Repair (Code-Code)     GraphCodeBERT  TextFooler   62.16   54.23  7.92    78.92    51.07    67.89      34.89
Repair (Code-Code)     GraphCodeBERT  BERT-Attack  62.16   53.33  8.83    99.4     62.59    56.05      36.64
Repair (Code-Code)     GraphCodeBERT  CodeAttack   62.16   51.97  10.19   99.52    24.67    66.16      40.63
Summarize (Code-NL)    CodeT5         TextFooler   20.06   14.96  5.70    64.6     410.15   53.91      27.08
Summarize (Code-NL)    CodeT5         BERT-Attack  20.06   11.96  8.70    90.4     1006.28  51.34      34.30
Summarize (Code-NL)    CodeT5         CodeAttack   20.06   11.06  9.59    82.8     314.87   52.67      34.71
Summarize (Code-NL)    CodeBERT       TextFooler   19.76   14.38  5.37    61.1     358.43   54.10      26.10
Summarize (Code-NL)    CodeBERT       BERT-Attack  19.76   11.30  8.35    93.74    695.03   50.31      34.16
Summarize (Code-NL)    CodeBERT       CodeAttack   19.76   10.88  8.87    88.32    204.46   52.95      34.62
Summarize (Code-NL)    RoBERTa        TextFooler   19.06   14.06  4.99    62.6     356.68   54.11      25.67
Summarize (Code-NL)    RoBERTa        BERT-Attack  19.06   11.34  7.71    94.15    701.01   50.10      33.14
Summarize (Code-NL)    RoBERTa        CodeAttack   19.06   10.98  8.08    87.51    183.22   53.03      33.47
Table 2: Results for adversarial attack on translation (C#-Java),
repair (Java-Java), and summarization (PHP)
tasks. The downstream performance for Code-Code tasks is measured in
CodeBLEU; and for Code-NL task in
BLEU. The best result is in boldface; the next best is underlined.
Overall CodeAttack outperforms significantly
(p < 0.05).
Attack Quality. We automatically measure the
attack quality using the following.
• Attack %: Computes the % of successful attacks
as measured by the ∆drop. Higher the value,
more successful the attack.
• # Query: Under a black-box setting, the adversary can query the
victim model to check for
changes in the output logits. Lower the average
number of queries required per sample, more
efficient the adversary.
• # Perturbation: The number of tokens perturbed on average to
generate an adversarial code.
Lower the value, more imperceptible the attack.
To measure the quality of the perturbed code, we
calculate CodeBLEUq = CodeBLEU(X , Xadv).
Higher the CodeBLEUq, better the quality of the
adversarial code. Since we want the ∆drop to be
as high as possible while maintaining the attack
% and CodeBLEUq, we measure the geometric
mean (GMean) between ∆drop, attack%, and the
CodeBLEUq to measure the overall performance.
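As a worked example of the overall score, the sketch below recomputes ∆drop and GMean from the values reported in Table 2 for TextFooler attacking CodeT5 on the translation task. The helper names are hypothetical; `math.prod` requires Python 3.8 or later.

```python
import math


def delta_drop(q_before, q_after):
    """Performance drop: quality before the attack minus quality after it."""
    return q_before - q_after


def gmean(values):
    """Geometric mean of a list of positive numbers."""
    return math.prod(values) ** (1.0 / len(values))


# Values copied from the TextFooler / CodeT5 / Translate row of Table 2.
row = {"before": 73.99, "after": 68.08, "attack_pct": 28.29, "codebleu_q": 63.19}

drop = delta_drop(row["before"], row["after"])
overall = gmean([drop, row["attack_pct"], row["codebleu_q"]])
```

The computed drop is about 5.91 and the geometric mean is about 21.94, in line with the ∆drop and GMean columns for that row. Because the three factors are multiplied, an attack that scores very low on any one of them is penalised in the overall score.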
4.5 Results
The results for attacking pre-trained PL models
for (i) Code Translation, (ii) Code Repair, and (iii)
Code Summarization are shown in Table 2. Due
to lack of space, we only show results for the C#
to Java translation task and for the PHP code summarization task
(refer to Appendix A for results
on Java-C# translation and code summarization
results on Python and Java). We use the metrics described in Section
4.4 and compare our model with
two state-of-the-art adversarial NLP baselines: (i)
TextFooler (Jin et al., 2020), and (ii) BERT-Attack
(Li et al., 2020).
Downstream Performance Drop. The average
∆drop using CodeAttack is at least 20% for the code
translation task and 10% for both the code repair
and code summarization tasks for all three pre-trained models. ∆drop is
higher for BERT-Attack
for translation and repair tasks but its attack quality
(described later) is the lowest. CodeAttack has
the best ∆drop for summarization.

Table 3: Qualitative examples of perturbed codes using TextFooler,
BERT-Attack, and CodeAttack on Code
Translation task.

Example 1 (CodeBLEUbefore: 77.09)
Original Code:
    public string GetFullMessage() {
    ...
    if (msgB < 0){return string.Empty;}
    ...
    return RawParseUtils.Decode(enc, raw, msgB, raw.Length);
    }
TextFooler (∆drop: 18.84; CodeBLEUq: 95.11):
    citizenship string GetFullMessage() {
    ...
    if (msgB < 0){return string.Empty;}
    ...
    return RawParseUtils.Decode(enc, raw, msgB, raw.Length);
    }
BERT-Attack (∆drop: 15.09; CodeBLEUq: 57.46):
    loop string GetFullMessage() {
    ...
    if (msgB < 0){return string.Empty;}
    ...
    return [UNK][UNK].[UNK](x) raw, msgB, raw.Length);
    }
CodeAttack (∆drop: 21.04; CodeBLEUq: 88.65):
    public string GetFullMessage() {
    ...
    if (msgB = 0){return string.Empty;}
    ...
    return RawParseUtils.Decode(enc, raw, msgB, raw.Length);
    }

Example 2 (CodeBLEUbefore: 100)
Original Code:
    public override void WriteByte(byte b) {
    if (outerInstance.upto == outerInstance.blockSize) {
    ... }
    }
TextFooler (∆drop: 5.74; CodeBLEUq: 63.28):
    audiences revoked canceling WriteByte(byte b) {
    if (outerInstance.upto == outerInstance.blockSize) {
    .... }
    }
BERT-Attack (∆drop: 27.26; CodeBLEUq: 49.87):
    public override void [UNK][UNK]() b) {
    if (outerInstance.upto == outerInstance.blockSize) {
    ... }
    }
CodeAttack (∆drop: 20.04; CodeBLEUq: 91.69):
    public override void WriteByte((bytes b) {
    if (outerInstance.upto == outerInstance.blockSize) {
    ... }
    }

(a) CodeBLEUafter (b) CodeBLEUq (c) Average #Query (d) Attack%
Figure 2: Effectiveness of the attack models on CodeT5 for the code
translation task (C#-Java).
Attack Quality. We observe that CodeAttack
has the highest attack success % for code translation and the code
repair tasks; and the second
best success rate for the code summarization task.
CodeAttack is more efficient as it has the lowest average query number
per sample. This shows
that it successfully attacks more samples with less
querying. Table 3 presents some qualitative examples of the generated
adversarial code snippets
from different attack models. TextFooler has the
best CodeBLEUq (as seen in Table 2) but it replaces keywords with
closely related natural language words (‘public’:
‘citizenship’/‘audiences’;
‘override’: ‘revoked’; ‘void’: ‘canceling’). BERT-Attack has the
lowest CodeBLEUq and substitutes
tokens with either a special ‘[UNK]’ token or with
other seemingly random words. This is expected
since neither TextFooler nor BERT-Attack has
been designed for programming languages. On
the other hand, although CodeAttack has the
second best CodeBLEUq, it generates more meaningful adversarial
samples by replacing variables
and operators in ways that are imperceptible.
Effectiveness. To study the effectiveness of
CodeAttack, we limit the number of perturbations (Figure 2). From Figure 2a,
we observe that as the
perturbation % increases, the CodeBLEUafter for
CodeAttack decreases but remains constant for
TextFooler and slightly increases for BERT-Attack.
We also observe that although CodeBLEUq for
CodeAttack is the second best (Figure 2b), it
has the highest attack success rate (Figure 2d)
and the lowest number of required queries (Figure 2c) throughout. This
shows the efficiency of
CodeAttack and the need for code specific adversarial attacks.
Overall Performance. Overall, CodeAttack
has the best performance when we consider the
geometric mean (GMean) between ∆drop, attack
%, and CodeBLEUq together. These results are
generalizable across different input programming
languages and different downstream tasks (C# in
case of code translation; Java in case of code repair,
PHP in case of code summarization).
4.6 Ablation Study
We conduct an ablation study to evaluate the importance of selecting
vulnerable tokens (V) and
applying constraints (C) to maintain the syntax of
the perturbed code. Figure 3 shows the results for
the ablation study on the code translation task from
C#-Java. See Appendix A for qualitative examples.
(a) Performance Drop (b) CodeBLEUq (c) # Query (d) Average Success Rate
Figure 3: Ablation Study for Code Translation (C#-Java): Performance
of CodeAttack with (+) and without (-)
the vulnerable tokens (V) and the two constraints (C): (i) Operator
level (C1), and (ii) Token level (C2).
Importance of Vulnerable Tokens. We define
a variant, CodeAttack+V-C, which finds vulnerable tokens based on
logit information (Section 3.2.1) and substitutes them, albeit without
any constraints. We create another variant,
CodeAttack-V-C, which randomly samples tokens from the input code to
attack. As can be seen
from Figure 3a, the latter attack is less effective: its ∆drop is
lower than that of the former for the
same CodeBLEUq (Figure 3b) and attack%
(Figure 3d).
Importance of Constraints. We substitute the
vulnerable tokens using the predictions from a
masked PL model with (+C) and without (-C)
any code specific constraints. We apply two
types of constraints: (i) Operator level constraint (CodeAttack+V+C1),
and (ii) Token
level constraint (CodeAttack+V+C1+C2) (Section 3.2.2). Only applying
the first constraint results in lower attack success % (Figure 3d) and
∆drop (Figure 3a) but a much higher CodeBLEUq.
On applying both the constraints together, the
∆drop and the attack success % improve. Overall,
the final model, CodeAttack+V+C1+C2, has
the best tradeoff between the ∆drop, attack success
%, CodeBLEUq, and #Queries required.
Human Evaluation. We sample 50 original and
perturbed Java and C# code samples and shuffle
them to create a mix. We ask 3 human annotators,
familiar with the two programming languages, to
classify the code as either original or adversarial.
We also ask them to rate the syntactic correctness of
the codes on a scale of 1 to 5; where 1 is completely
incorrect syntax; and 5 is the perfect syntax. On
an average, 72.10% of the codes were classified
as original and the average syntactic correctness
was 4.14 for the adversarial code. Additionally, we
provided the annotators with pairs of original and
adversarial codes and asked them to rate the ‘visual’
similarity between them on a scale of 0 to 1, where 0 is
not similar at all, 0.5 is somewhat similar, and 1 is
very similar. On average, the similarity was 0.71.
5 Discussions and Limitations
We observe that it is easier to attack the code translation task than
the code repair or code summarization tasks. Since code repair aims to
fix bugs
in the given code snippet, attacking it is more
challenging. For code summarization, the BLEU
score drops by almost 50%. For all three tasks,
CodeT5 is the most robust whereas GraphCodeBERT is the most
susceptible to attacks using
CodeAttack. CodeT5 has been pre-trained on
the task of Masked Identifier Prediction or deobfuscation (Lachaux et
al., 2021) where changing the
identifier names does not have an impact on the
code semantics. This helps the model avoid the attacks which involve
changing the identifier names,
and in turn makes it more robust. GraphCodeBERT,
on the other hand, uses data flow graphs in its pre-training, which
relies on predicting the relationship between the identifiers.
Since CodeAttack
modifies the identifiers and perturbs the relationship between them,
it proves extremely effective
on GraphCodeBERT. This results in a more significant ∆drop on
GraphCodeBERT compared to other
models for the code translation task.
CodeAttack, although effective, has a few limitations. These
adversarial attacks can be avoided if
the pre-trained models choose to compile the input
code before processing. The PL models can also
be made more robust by either additionally pretraining or fine-tuning
them using the generated
adversarial examples. Incorporating more tasks
such as code obfuscation in the pre-training stage
might also help with the robustness of the models.
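The compile-before-processing defence could look roughly like the
following sketch. It uses Python's built-in compile() as a stand-in
syntax check; the helper name compiles and the three sample snippets
are hypothetical, and a real pipeline would need the compiler of
whichever language the model consumes.

```python
def compiles(source: str) -> bool:
    """Return True if the snippet parses as valid Python source."""
    try:
        compile(source, "<input>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


# Hypothetical inputs: a clean snippet, an operator-level perturbation,
# and an identifier-level perturbation.
clean = "for j in range(3):\n    print(j)\n"
operator_perturbed = "for j ? range(3):\n    print(j)\n"
identifier_perturbed = "for jj in range(3):\n    print(j)\n"

for name, snippet in [
    ("clean", clean),
    ("operator_perturbed", operator_perturbed),
    ("identifier_perturbed", identifier_perturbed),
]:
    print(name, compiles(snippet))
```

Note that a pure syntax check of this kind only rejects perturbations
that break the grammar. The identifier-level perturbation above still
parses, so such a filter would be a partial defence at best.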
6 Conclusion
We introduce a black-box adversarial attack model,
CodeAttack, to detect vulnerabilities of the
state-of-the-art programming language models.
CodeAttack finds the most vulnerable tokens
in the given code snippet and uses a greedy search
mechanism to identify contextualised substitutes
subject to code-specific constraints. Our model incorporates the
syntactic information of the input
code to generate adversarial examples that are effective and
imperceptible, and that maintain code fluency and
consistency. We perform an extensive empirical
and human evaluation to demonstrate the transferability of CodeAttack
on several code-code and
code-NL tasks across different programming languages. CodeAttack
outperforms the existing
state-of-the-art adversarial NLP models when both
the performance drop and the attack quality are
taken together. CodeAttack uses fewer queries
and is more efficient, highlighting the need for
code-specific adversarial attacks.
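The two steps named above (rank vulnerable tokens, then greedily
substitute under constraints) can be sketched on a toy problem. This
is an illustrative assumption-laden sketch, not the authors'
implementation: toy_victim_score stands in for a black-box model, the
CANDIDATES table stands in for contextualised substitutes from a
masked language model, and the same-token-class rule is only one
possible reading of a code-specific constraint.

```python
import keyword
import re

OPERATORS = {"<", ">", "<=", ">=", "==", "!=", "+", "-", "="}

# Hypothetical victim: weighted token match against a fixed reference.
REFERENCE = ["if", "x", "<", "0", ":", "return", "x"]
WEIGHTS = [1, 2, 4, 1, 1, 1, 2]

# Hypothetical substitutes; a real attack would query a masked LM.
CANDIDATES = {"<": [">", "a"], "x": ["y", "+"]}


def toy_victim_score(tokens):
    """Black-box stand-in: higher means better downstream output."""
    matched = sum(w for t, r, w in zip(tokens, REFERENCE, WEIGHTS) if t == r)
    return matched / sum(WEIGHTS)


def token_class(tok):
    """Coarse token class used as the substitution constraint."""
    if tok in OPERATORS:
        return "operator"
    if keyword.iskeyword(tok):
        return "keyword"
    if re.fullmatch(r"[A-Za-z_]\w*", tok):
        return "identifier"
    return "other"


def rank_vulnerable(tokens, score):
    """Order positions by how much masking each one lowers the score."""
    base = score(tokens)
    drops = []
    for i in range(len(tokens)):
        masked = tokens[:i] + ["<mask>"] + tokens[i + 1:]
        drops.append((base - score(masked), i))
    return [i for _, i in sorted(drops, key=lambda d: -d[0])]


def greedy_attack(tokens, score, candidates, max_changes=2, threshold=0.5):
    """Greedily replace the most vulnerable tokens, one position at a time."""
    adv = list(tokens)
    changes = 0
    for i in rank_vulnerable(tokens, score):
        if changes >= max_changes or score(adv) < threshold:
            break
        # Constraint: a substitute must share the original token's class.
        allowed = [c for c in candidates.get(adv[i], [])
                   if token_class(c) == token_class(adv[i])]
        best = min(allowed,
                   key=lambda c: score(adv[:i] + [c] + adv[i + 1:]),
                   default=None)
        if best is not None and score(adv[:i] + [best] + adv[i + 1:]) < score(adv):
            adv[i] = best
            changes += 1
    return adv


adversarial = greedy_attack(REFERENCE, toy_victim_score, CANDIDATES)
print(adversarial, toy_victim_score(adversarial))
```

In this toy run the operator is attacked first because masking it
causes the largest score drop, and the non-operator candidate "a" is
filtered out by the class constraint.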
References
Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, and
Kai-Wei Chang. 2021. Unified pre-training for program understanding
and generation. In Proceedings
of the 2021 Conference of the North American Chapter of the
Association for Computational Linguistics:
Human Language Technologies, pages 2655–2668.
Moustafa Alzantot, Yash Sharma, Ahmed Elgohary,
Bo-Jhang Ho, Mani Srivastava, and Kai-Wei Chang.
2018. Generating natural language adversarial examples. In Proceedings
of the 2018 Conference on
Empirical Methods in Natural Language Processing,
pages 2890–2896.
Leonhard Applis, Annibale Panichella, and Arie van
Deursen. 2021. Assessing robustness of ml-based
program analysis tools using metamorphic program
transformations. In 2021 36th IEEE/ACM International Conference on
Automated Software Engineering (ASE), pages 1377–1381. IEEE.
Javid Ebrahimi, Anyi Rao, Daniel Lowd, and Dejing
Dou. 2018. Hotflip: White-box adversarial examples for text
classification. In Proceedings of the
56th Annual Meeting of the Association for Computational Linguistics
(Volume 2: Short Papers), pages
31–36.
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming
Gong, Linjun Shou, Bing Qin,
Ting Liu, Daxin Jiang, et al. 2020. Codebert: A
pre-trained model for programming and natural languages. In Findings
of the Association for Computational Linguistics: EMNLP 2020, pages
1536–1547.
Ji Gao, Jack Lanchantin, Mary Lou Soffa, and Yanjun Qi. 2018.
Black-box generation of adversarial
text sequences to evade deep learning classifiers. In
2018 IEEE Security and Privacy Workshops (SPW),
pages 50–56. IEEE.
Siddhant Garg and Goutham Ramakrishnan. 2020.
Bae: Bert-based adversarial examples for text classification. In
Proceedings of the 2020 Conference on
Empirical Methods in Natural Language Processing
(EMNLP), pages 6174–6181.
Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu
Tang, LIU Shujie, Long Zhou, Nan Duan, Alexey
Svyatkovskiy, Shengyu Fu, et al. 2020. Graphcodebert: Pre-training
code representations with data
flow. In International Conference on Learning Representations.
Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei
Wei, Wen-Lian Hsu, and Cho-Jui Hsieh. 2019. On
the robustness of self-attentive models. In Proceedings of the 57th
Annual Meeting of the Association
for Computational Linguistics, pages 1520–1529.
Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis
Allamanis, and Marc Brockschmidt. 2019. Codesearchnet challenge:
Evaluating the state of semantic code search. arXiv preprint
arXiv:1909.09436.
Di Jin, Zhijing Jin, Joey Tianyi Zhou, and Peter
Szolovits. 2020. Is bert really robust? a strong baseline for natural
language attack on text classification
and entailment. In Proceedings of the AAAI conference on artificial
intelligence, volume 34, pages
8018–8025.
Marie-Anne Lachaux, Baptiste Roziere, Marc
Szafraniec, and Guillaume Lample. 2021. Dobf: A
deobfuscation pre-training objective for programming languages.
Advances in Neural Information
Processing Systems, 34.
Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang,
and Eduard Hovy. 2017. Race: Large-scale reading
comprehension dataset from examinations. In Proceedings of the 2017
Conference on Empirical Methods in Natural Language Processing, pages
785–794.
J Li, S Ji, T Du, B Li, and T Wang. 2019. Textbugger:
Generating adversarial text against real-world applications. In 26th
Annual Network and Distributed
System Security Symposium.
Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue,
and Xipeng Qiu. 2020. Bert-attack: Adversarial attack against bert
using bert. In Proceedings of the
2020 Conference on Empirical Methods in Natural
Language Processing (EMNLP), pages 6193–6202.
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi
Chen, Omer Levy, Mike Lewis,
Luke Zettlemoyer, and Veselin Stoyanov. 2019.
Roberta: A robustly optimized bert pretraining approach. arXiv
preprint arXiv:1907.11692.
Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey
Svyatkovskiy, Ambrosio Blanco, Colin B. Clement,
Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou,
Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel
Sundaresan, Shao Kun Deng, Shengyu Fu, and Shujie
Liu. 2021. Codexglue: A machine learning benchmark dataset for code
understanding and generation.
CoRR, abs/2102.04664.
Nicholas Metropolis, Arianna W Rosenbluth, Marshall N Rosenbluth,
Augusta H Teller, and Edward
Teller. 1953. Equation of state calculations by
fast computing machines. The journal of chemical
physics, 21(6):1087–1092.
Nicolas Papernot, Fartash Faghri, Nicholas Carlini, Ian
Goodfellow, Reuben Feinman, Alexey Kurakin, Cihang Xie, Yash Sharma,
Tom Brown, Aurko Roy,
et al. 2016. Technical report on the cleverhans
v2.1.0 adversarial examples library. arXiv preprint
arXiv:1610.00768.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002.
Bleu: a method for automatic evaluation of machine translation. In
Proceedings of
the 40th Annual Meeting of the Association for Computational
Linguistics, pages 311–318, Philadelphia,
Pennsylvania, USA. Association for Computational
Linguistics.
Danish Pruthi, Bhuwan Dhingra, and Zachary C Lipton. 2019. Combating
adversarial misspellings with
robust word recognition. In Proceedings of the
57th Annual Meeting of the Association for Computational Linguistics,
pages 5582–5591.
Ruchir Puri, David S Kung, Geert Janssen, Wei
Zhang, Giacomo Domeniconi, Vladimir Zolotov, Julian Dolby, Jie Chen,
Mihir Choudhury, Lindsey
Decker, et al. 2021. Project codenet: a large-scale
ai for code dataset for learning a diversity of coding
tasks.
Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and
Percy Liang. 2016. SQuAD: 100,000+ questions for
machine comprehension of text. In Proceedings of
the 2016 Conference on Empirical Methods in Natural Language
Processing, pages 2383–2392, Austin,
Texas. Association for Computational Linguistics.
Goutham Ramakrishnan, Jordan Henkel, Zi Wang,
Aws Albarghouthi, Somesh Jha, and Thomas Reps.
2020. Semantic robustness of models of source code.
arXiv preprint arXiv:2002.03043.
Shuhuai Ren, Yihe Deng, Kun He, and Wanxiang Che.
2019. Generating natural language adversarial examples through
probability weighted word saliency.
In Proceedings of the 57th Annual Meeting of the
Association for Computational Linguistics, pages
1085–1097, Florence, Italy. Association for Computational Linguistics.
Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie
Liu, Duyu Tang, Neel Sundaresan, Ming Zhou, Ambrosio Blanco, and Shuai
Ma. 2020. Codebleu: a
method for automatic evaluation of code synthesis.
arXiv preprint arXiv:2009.10297.
Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta,
Martin White, and Denys Poshyvanyk. 2019. An empirical study on
learning bug-fixing patches in the wild via neural machine translation.
ACM Transactions on Software Engineering
and Methodology (TOSEM), 28(4):1–29.
Alex Wang, Amanpreet Singh, Julian Michael, Felix
Hill, Omer Levy, and Samuel Bowman. 2018. Glue:
A multi-task benchmark and analysis platform for
natural language understanding. In Proceedings
of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting
Neural Networks for NLP,
pages 353–355.
Yue Wang, Weishi Wang, Shafiq Joty, and Steven CH
Hoi. 2021. Codet5: Identifier-aware unified pre-trained encoder-decoder
models for code understanding and generation. In Proceedings of the
2021 Conference on Empirical Methods in Natural Language Processing,
pages 8696–8708.
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V
Le, Mohammad Norouzi, Wolfgang Macherey,
Maxim Krikun, Yuan Cao, Qin Gao, Klaus
Macherey, et al. 2016. Google’s neural machine
translation system: Bridging the gap between human and machine
translation. arXiv preprint
arXiv:1609.08144.
Puyudi Yang, Jianbo Chen, Cho-Jui Hsieh, Jane-Ling
Wang, and Michael I Jordan. 2020. Greedy attack
and gumbel attack: Generating adversarial examples
for discrete data. J. Mach. Learn. Res., 21(43):1–36.
Zhou Yang, Jieke Shi, Junda He, and David Lo. 2022.
Natural attack for pre-trained models of code. arXiv
preprint arXiv:2201.08698.
Noam Yefet, Uri Alon, and Eran Yahav. 2020. Adversarial examples for
models of code. Proceedings of the ACM on Programming Languages,
4(OOPSLA):1–30.
Huangzhao Zhang, Zhuo Li, Ge Li, Lei Ma, Yang Liu,
and Zhi Jin. 2020. Generating adversarial examples for holding
robustness of source code processing models. In Proceedings of the
AAAI Conference on Artificial Intelligence, volume 34, pages
1169–1176.
Yaqin Zhou, Shangqing Liu, Jingkai Siow, Xiaoning
Du, and Yang Liu. 2019. Devign: Effective vulnerability identification
by learning comprehensive
program semantics via graph neural networks. Advances in neural
information processing systems, 32.
Yu Zhou, Xiaoqing Zhang, Juanjuan Shen, Tingting
Han, Taolue Chen, and Harald Gall. 2021. Adversarial robustness of
deep code comment generation.
arXiv preprint arXiv:2108.00213.
A Appendix
A.1 Results
Downstream Performance and Attack Quality
We report BLEU, ∆BLEU, EM, ∆EM, CodeBLEU, and ∆CodeBLEU to
measure the downstream performance for code-code tasks (code repair
and code translation). The programming languages used are C#-Java and
Java-C# for translation tasks, and Java for code repair tasks (Table 4
and Table 5). For the code-NL task (code summarization), we report
BLEU and ∆BLEU. We show the
results for three programming languages: Python,
Java, and PHP (Table 6). We measure the quality of
the attacks using the metrics defined in Section 4.4 and additionally
show BLEUq, which measures the BLEU
score between the original and the perturbed code.
The results follow a similar pattern as that seen in
Section 4.5.
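As a reading aid for the ∆ columns, the parenthesised values in the
tables appear to be the original score minus the score under attack.
The short sketch below recomputes two of them from Table 4 (Java-C#,
CodeT5, TextFooler row). The helper name delta and the rounding to two
decimals are assumptions made for this illustration.

```python
def delta(original: float, attacked: float) -> float:
    """Performance drop caused by an attack, rounded to two decimals."""
    return round(original - attacked, 2)


# Values copied from Table 4: Java-C#, CodeT5, Original vs. TextFooler.
bleu_drop = delta(85.37, 77.47)
em_drop = delta(67.9, 47.0)

print(bleu_drop, em_drop)
```

The recomputed drops match the reported ∆BLEU of 7.9 and ∆EM of 20.9
for that row.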
Ablation Study: Qualitative Analysis
Table 7 shows the adversarial examples generated using the
variants described in Section 4.6.
Task     Victim         Attack       BLEU (∆BLEU)   EM (∆EM)       CodeBLEU (∆CB)
Java-C#  CodeT5         Original     85.37          67.9           87.03
                        TextFooler   77.47 (7.9)    47.0 (20.9)    79.83 (7.19)
                        BERT-Attack  59.38 (25.99)  13.2 (54.7)    66.92 (20.11)
                        CodeAttack   64.85 (20.52)  5.2 (62.7)     68.81 (18.21)
         CodeBERT       Original     81.81          62.5           83.48
                        TextFooler   71.26 (10.55)  30.4 (32.1)    73.52 (9.95)
                        BERT-Attack  54.25 (27.56)  3.1 (59.4)     54.69 (28.79)
                        CodeAttack   65.31* (16.5)  9 (53.5)       66.99* (16.49)
         GraphCodeBERT  Original     80.35          59.4           82.4
                        TextFooler   72.15 (8.2)    35.24 (24.16)  74.32 (8.07)
                        BERT-Attack  54.54 (25.81)  2.4 (57)       54.47 (27.93)
                        CodeAttack   62.35* (18)    9.4 (50)       64.87 (17.52)
C#-Java  CodeT5         Original     81.54          70.6           73.99
                        TextFooler   72.62 (8.92)   50 (20.6)      68.08 (5.91)
                        BERT-Attack  45.71 (35.83)  15.54 (55.06)  48.59 (25.40)
                        CodeAttack   58.05 (23.49)  11 (59.6)      61.72 (12.27)
         CodeBERT       Original     77.24          62             71.16
                        TextFooler   67.31 (9.93)   34 (28)        60.45 (10.71)
                        BERT-Attack  48.74 (28.5)   2 (60)         58.80 (12.36)
                        CodeAttack   59.41 (17.83)  2.4 (59.6)     54.14 (17.02)
         GraphCodeBERT  Original     70.97          56.5           66.80
                        TextFooler   62.49 (8.48)   37.33 (19.17)  46.51 (20.29)
                        BERT-Attack  43.2 (55.39)   1.11 (30.26)   36.54 (27.77)
                        CodeAttack   48.83 (22.14)  2.6 (53.9)     38.81 (27.99)
Repair   CodeT5         Original     78.11          19.9           61.13
                        TextFooler   73.23 (4.88)   1.4 (18.5)     57.59 (3.53)
                        BERT-Attack  65.86 (12.25)  1.2 (18.7)     52.70 (8.42)
                        CodeAttack   66.1 (12.01)   1.28 (18.62)   53.21 (7.92)
         CodeBERT       Original     78.66          15.64          61.33
                        TextFooler   67.06 (11.6)   4.48 (11.16)   53.55 (7.78)
                        BERT-Attack  60.52 (18.14)  3.4 (12.24)    51.95 (9.38)
                        CodeAttack   61.23 (17.43)  4.05 (11.59)   52.02 (9.31)
         GraphCodeBERT  Original     79.73          15.05          62.16
                        TextFooler   66.19 (13.54)  2.5 (12.55)    54.23 (7.92)
                        BERT-Attack  64.1 (15.63)   3.5 (11.55)    53.33 (8.83)
                        CodeAttack   64.7 (15.03)   4.38 (10.67)   51.97 (10.19)
Table 4: Downstream Performance: Code-Code Tasks. Best result in bold
and the next best is underlined.
Task     Victim         Attack       Attack%  Query#   BLEUq   CodeBLEUq
Java-C#  CodeT5         TextFooler   32.3     62.9     78.95   81.28
                        BERT-Attack  85.6     112.5    54.95   69.48
                        CodeAttack   94.8     19.85    68.08   75.21*
         CodeBERT       TextFooler   55.9     38.57    82.75   83.93
                        BERT-Attack  95.39    46.09    54.75   76.18
                        CodeAttack   91.1     24.42    66.89   76.77*
         GraphCodeBERT  TextFooler   51.21    39.33    82.41   82.45
                        BERT-Attack  96.2     38.29*   53.46   73.55
                        CodeAttack   90.8*    23.22    68.42   77.33
C#-Java  CodeT5         TextFooler   28.29    94.95*   69.47   63.19*
                        BERT-Attack  83.12    186.1    41.45   51.11
                        CodeAttack   89.3     36.84    67.09   65.91
         CodeBERT       TextFooler   49.2     73.91    73.19   66.61
                        BERT-Attack  97.1     48.76    52.33   59.90
                        CodeAttack   97.7     26.43    66.24   66.89
         GraphCodeBERT  TextFooler   38.70    83.17    70.61   63.62
                        BERT-Attack  97.33    41.30    55.71   57.41
                        CodeAttack   98       20.60    68.07   65.39
Repair   CodeT5         TextFooler   58.84    90.50    93.37   69.53
                        BERT-Attack  98.1     121.1    80.56   55.49
                        CodeAttack   99.36    30.68    88.95*  69.03*
         CodeBERT       TextFooler   81.61    45.89    91.82   68.16
                        BERT-Attack  98.3     74.99    80.7    55.94
                        CodeAttack   99.39    25.98    87.83   68.05
         GraphCodeBERT  TextFooler   78.92    51.07    91.4    67.89
                        BERT-Attack  99.4     62.59    81.73   56.05
                        CodeAttack   99.52    24.67    85.68   66.16
Table 5: Attack Quality: Code-Code Tasks. Best result in bold and the
next best is underlined.
Task    Victim    Attack       BLEU (∆BLEU)   Attack%  Query#   BLEUq  CodeBLEUq
Java    CodeT5    Original     19.77
                  TextFooler   14.06 (5.71)   67.8     291.82   75.33  92.82
                  BERT-Attack  11.94 (7.82)   93.34    541.43   54.47  48.35
                  CodeAttack   11.21 (8.56)   80.8     198.11   68.51  90.04
        CodeBERT  Original     17.65
                  TextFooler   16.84 (1.20)   42.4     400.78   65.88  90.29
                  BERT-Attack  12.87 (4.77)   84.6     826.71   34.51  83.82
                  CodeAttack   14.69 (2.85)   73.7     340.99   59.74  59.37
        RoBERTa   Original     16.47
                  TextFooler   13.23 (3.23)   44.9     383.36   67.9   90.87
                  BERT-Attack  12.67 (3.8)    72.93    901.01   28.18  42.09
                  CodeAttack   11.74 (4.73)   50.14    346.07   32.63  48.48
PHP     CodeT5    Original     20.66
                  TextFooler   14.96 (5.70)   64.6     410.15   78.11  53.91
                  BERT-Attack  11.96 (8.70)   90.4     1006.28  49.3   51.34
                  CodeAttack   11.06 (9.59)   82.8     314.87   66.02  52.67
        CodeBERT  Original     19.76
                  TextFooler   14.38 (5.37)   61.1     358.43   79.81  54.10
                  BERT-Attack  11.30 (8.45)   93.74    695.03   50.77  50.31
                  CodeAttack   10.88 (8.87)   88.32    204.46   69.11  52.95
        RoBERTa   Original     19.06
                  TextFooler   14.06 (4.99)   62.6     356.68   79.73  54.11
                  BERT-Attack  11.34 (7.71)   94.15    701.01   51.49  50.10
                  CodeAttack   10.98 (8.08)   87.51    183.22   70.33  53.03
Python  CodeT5    Original     20.26
                  TextFooler   12.11 (8.24)   90.47    400.06   86.84  77.59
                  BERT-Attack  8.22 (12.13)   99.81    718.07   77.11  64.66
                  CodeAttack   7.79 (12.38)   98.50    174.05   87.04  69.17
        CodeBERT  Original     78.66
                  TextFooler   20.76 (5.40)   68.5     966.19   76.56  75.15
                  BERT-Attack  18.95 (7.21)   93.72    1414.67  55.22  52.31
                  CodeAttack   18.69 (7.47)   86.63    560.58   63.84  59.11
        RoBERTa   Original     17.01
                  TextFooler   10.72 (6.29)   63.34    788.25   70.48  74.05
                  BERT-Attack  10.66 (6.35)   89.64    1358.85  51.74  56.75
                  CodeAttack   9.5 (7.51)     76.09    661.75   55.45  61.22
Table 6: Downstream Performance and Attack Quality on Code-NL
(Summarization) Task for different programming languages. Best result
in bold and the next best is underlined.
Example 1

Original Code (CodeBLEUbefore: 76.3)

    public void AddMultipleBlanks(MulBlankRecord mbr) {
        for (int j = 0; j < mbr.NumColumns; j++) {
            BlankRecord br = new BlankRecord();
            br.Column = j + mbr.FirstColumn;
            br.Row = mbr.Row;
            br.XFIndex = (mbr.GetXFAt(j));
            InsertCell(br);
        }
    }

CodeAttack+V-C (∆drop: 7.21; CodeBLEUq: 43.85)

    ((void AddMultipleBlanks(MulBlankRecord mbr) {
        for (int j ? 0; j < mbr.NumColumns; j++) {
            BlankRecord br = new BlankRecord();
            br.Column = j + mbr.FirstColumn;
            br.Row = mbr.Row;
            br.XFIndex = (mbr.GetXFAt(j));
            InsertCell(br);
        }
    }

CodeAttack+V+C1 (∆drop: 5.85; CodeBLEUq: 69.61)

    public void AddMultipleBlanks(MulBlankRecord mbr) {
        for (int j > 0; j < mbr.NumColumns; j++) {
            BlankRecord br = -new BlankRecord();
            br.Column = j + mbr.FirstColumn;
            br.Row = mbr.Row;
            br.XFIndex > (mbr.GetXFAt(j));
            InsertCell(br);
        }
    }

CodeAttack+V+C1+C2 (∆drop: 12.96; CodeBLEUq: 59.29)

    static void AddMultipleBlanks(MulBlankRecord mbr) {
        for (int j > 0; jj < mbr.NumColumns; j++) {
            BlankRecord br = new BlankRecord();
            br.Column = j + mbr.FirstColumn;
            br.Row = mbr.Row;
            br.XFIndex = (mbr.GetXFAt(j));
            InsertCell(br);
        }
    }

Example 2

Original Code (CodeBLEUbefore: 77.09)

    public string GetFullMessage() {
        byte[] raw = buffer;
        int msgB = RawParseUtils.TagMessage(raw, 0);
        if (msgB < 0) {
            return string.Empty;
        }
        Encoding enc = RawParseUtils.ParseEncoding(raw);
        return RawParseUtils.Decode(enc, raw, msgB, raw.Length);
    }

CodeAttack+V-C (∆drop: 10.42; CodeBLEUq: 64.19)

    ˘0120public string GetFullMessage() {
        byte[] raw = buffer;
        int msgB = RawParseUtils.TagMessage(raw, 0);
        if (msgB < 0) {
            return string.Empty;
        }
        Encoding enc = RawParseUtils.ParseEncoding(raw);
        return RawParseUtils.Decode(enc, RAW.., msgB, raw.Length);
    }

CodeAttack+V+C1 (∆drop: 21.93; CodeBLEUq: 87.25)

    public string GetFullMessage() {
        byte[] raw = buffer;
        int msgB = RawParseUtils.TagMessage(raw, 0);
        if (msgB = 0) {
            return string.Empty;
        }
        Encoding enc = RawParseUtils.ParseEncoding(raw);
        return RawParseUtils.Decode(enc, raw, msgB, raw.Length);
    }

CodeAttack+V+C1+C2 (∆drop: 22.8; CodeBLEUq: 71.30)

    static string GetFullMessage() {
        byte[] raw = buffer;
        int msgB = RawParseUtils.TagMessage(raw, 0);
        if (msgB < 0 {
            return string.Empty;
        }
        Encoding enc = RawParseUtils.ParseEncoding(raw);
        return RawParseUtils.Decode(enc, raw, MsgB,raw.Length);
    }

Table 7: Qualitative examples for the ablation study on CodeAttack:
Attack vulnerable tokens (V) without any
constraints (-C), with operator level constraints (+C1), and with
token level (+C2) constraints on the code translation
task.

