Adversarial camera stickers: A physical camera-based attack on deep learning systems


Juncheng B. Li (1, 2), Frank R. Schmidt (1), J. Zico Kolter (1, 2)
1 Bosch Center for Artificial Intelligence
2School of Computer Science, Carnegie Mellon University, Pittsburgh, USA.




Recent work has documented the susceptibility of deep learning systems to adversarial examples, but most such attacks directly manipulate the digital input to a classifier.






In the majority of the cases studied, however, these attacks are considered in a "purely digital" domain,

A smaller but still substantial line of work has emerged to show that these attacks can also transfer to the physical world:

In all these cases, though, the primary mode of attack has been manipulating the object of interest, often with very visually apparent perturbations. Compared to attacks in the digital space, physical attacks have not been explored to their full extent: we still lack baselines and feasible threat models that work robustly in reality. (Transition passage in the introduction.)




The overall procedure consists of three main contributions and improvements over past work:


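Our experiments show that, on real video data, a physically manufactured sticker can achieve an average targeted fooling rate of 52% over 5 different class/target class combinations, and can further reduce the accuracy of the classifier to 27%.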

In total, we believe this work substantially adds to the recent crucial considerations of the "right" notion of adversarial threat models, demonstrating that these pose physical risk when an attacker can access a camera. (Significance of the work.)

2. Background and Related Works

This paper will focus on the so-called white-box attack setting, where we assume access to the model. In this setting, past work can be roughly categorized into two relevant groups for our work: digital attacks and physical attacks.

2.1. Digital Attacks

Digital attacks have been relatively well studied since they first rose to prevalence in the context of deep learning in 2014, where Szegedy et al. (2013) used the box-constrained L-BFGS method to find the perturbation.

2.2. Physical Attacks

Compared with digital attacks, physically realizable attacks have not been explored to a full extent beyond a few existing works.

However, both these attacks may be hard to deploy in some settings because

  1. they require that each object of interest for which we want to fool the classifier be explicitly modified; and
  2. they are visually apparent to humans when inspecting the object. (In short: the object of interest must be altered, and the alterations are visible.)

In contrast to the existing methods of physical attacks, our proposed attack does not modify the object of interest and is inconspicuous.


3. Crafting Adversarial Stickers
This section contains the main methodological contribution of our paper: the algorithmic and practical pipeline for manufacturing adversarial camera stickers. To begin, we will first describe… We describe our approach… and finally how we adjust the free parameters of our attack to craft adversarial examples. (Structure: overview first, then the details in turn.)

3.1. A threat model for physical camera sticker attacks

Traditional attacks on neural network models work as follows. Given a classifier f : X → Y, we want to find some perturbation function π : X → X, such that for any input x ∈ X, π(x) looks "indistinguishable" from x,
The goal of standard adversarial attacks is, for a given x ∈ X, y ∈ Y, to find a perturbation π ∈ Π that maximizes the loss (note the descriptive language and the formal notation).
A slight variant of this approach is to jointly maximize the loss for the original class and minimize the loss for a target class.
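
The two objectives described above can be written out as follows. This is only a sketch in standard notation: the loss symbol \ell and the target label y_targ are names assumed here for illustration, since the excerpt does not reproduce the original equations.

```latex
% Untargeted attack: maximize the loss on the true label y.
\max_{\pi \in \Pi} \; \ell\big(f(\pi(x)),\, y\big)

% Targeted variant: additionally minimize the loss on an assumed target label y_targ.
\max_{\pi \in \Pi} \; \ell\big(f(\pi(x)),\, y\big) - \ell\big(f(\pi(x)),\, y_{\mathrm{targ}}\big)
```

The second form rewards perturbations that both push the prediction away from y and pull it toward y_targ, which matches the "joint" description in the text.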
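To design a threat model for a physical camera attack, we need to consider the approximate effect of placing small dots on a sticker. Because of the optics of the camera lens, placing a small opaque dot on the lens produces a small translucent patch in the image itself. Assuming sufficient lighting, this translucent overlay can be modeled by an alpha-blending operation between the original image and a dot of suitable size and color.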

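More formally, explicitly considering x to be a 2D image, we introduce a perturbation function π for a single dot, where (i, j) denotes a pixel location in the image.
Intuitively, this perturbation model captures the following process.
Each dot is described by its center coordinates and a color value. Each pixel of the perturbed image is a linear combination of the original pixel and the dot color, with the weight given by a position-dependent alpha mask. The closer a pixel is to the dot center, the larger alpha is, and β controls how smoothly alpha falls off.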
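
The blending process just described can be sketched in a few lines of plain Python. The exact falloff formula is not given in these notes, so the exponential form below, along with the function names `alpha_mask` and `apply_dot`, is an assumption made for illustration. It only respects the stated properties: alpha peaks at αmax at the dot center and decays smoothly with distance at a rate set by β and the radius.

```python
import math

def alpha_mask(height, width, center, radius, alpha_max, beta):
    """Return a height x width grid of blending weights for one dot."""
    ci, cj = center
    mask = []
    for i in range(height):
        row = []
        for j in range(width):
            # Squared distance to the dot center, normalized by the radius.
            d = ((i - ci) ** 2 + (j - cj) ** 2) / radius ** 2
            # Assumed falloff: alpha_max at the center, decaying with distance.
            row.append(alpha_max * math.exp(-(d ** beta)))
        mask.append(row)
    return mask

def apply_dot(image, center, color, radius, alpha_max, beta):
    """Alpha-blend one translucent dot of the given RGB color into an image.

    image is a list of rows; each row is a list of (r, g, b) tuples in [0, 1].
    """
    height, width = len(image), len(image[0])
    mask = alpha_mask(height, width, center, radius, alpha_max, beta)
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            a = mask[i][j]
            # Linear combination of the original pixel and the dot color.
            row.append(tuple((1 - a) * p + a * c
                             for p, c in zip(image[i][j], color)))
        out.append(row)
    return out

# Usage: a uniform gray 5x5 image with a red dot at its center.
image = [[(0.5, 0.5, 0.5)] * 5 for _ in range(5)]
perturbed = apply_dot(image, center=(2, 2), color=(1.0, 0.0, 0.0),
                      radius=2.0, alpha_max=0.3, beta=1.0)
```

Several dots can be composed by calling `apply_dot` repeatedly on the output of the previous call, one dot at a time.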

A visualization of a possible perturbation under this model, with multiple dots with different center locations and colors, αmax = 0.3, β = 1.0, and a radius of 40 pixels, is shown in Figure 2. (The specific parameter values are stated clearly, without needing to explain how they were obtained.)

3.2. Achieving inconspicuous, physically realizable perturbations

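Although the approach above can attack the model, it has two problems. First, the perturbations are too conspicuous. But second, and more subtly, many perturbations allowed by the model cannot be physically produced in the camera image, such as the white dots in Figure 2 above.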

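In more detail, our process works as follows. As shown in panel (a) of the figure below, the original clean image is denoted x0, and the images taken with a dot applied (panels c/d/e/f) are denoted x1. To fit the perturbation model so that it reproduces the observed perturbation, structural similarity (SSIM) is used to measure the similarity between the two images.
In theory, because both the SSIM and the perturbation model π are differentiable functions, we could simply use projected gradient descent (PGD) to optimize our perturbation model.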
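
To make the similarity measure concrete, here is a minimal sketch of SSIM computed from whole-image statistics. It is a simplification: standard SSIM averages the same expression over local sliding windows, and the notes do not say which variant the authors used. The function name `global_ssim` and the toy images are assumptions for illustration.

```python
def global_ssim(x, y, data_range=1.0):
    """SSIM computed from whole-image statistics of two grayscale images.

    x and y are equally sized lists of rows of floats in [0, data_range].
    """
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((v - mean_x) ** 2 for v in xs) / n
    var_y = sum((v - mean_y) ** 2 for v in ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2))

# Toy stand-ins for a clean image x0 and an image x1 observed through a dot.
clean = [[0.2, 0.4], [0.6, 0.8]]
dotted = [[0.2, 0.4], [0.6, 0.5]]
score_same = global_ssim(clean, clean)
score_diff = global_ssim(clean, dotted)
```

In a fitting loop, one would score the simulated image π(x0) against the observed x1 and keep the dot parameters that give the highest score.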

We performed this procedure for 50 different physical dots, to learn a single set of αmax, β, and r parameters, and 50 different colors (and of course, 50 different locations, though these parameters were only fit here in order to fit the remaining parameters well). (Specific details of the experimental parameter tuning.)

3.3. Constructing adversarial examples
Given this perturbation model, our final goal is to



Here we present experiments of our attack evaluating both the ability of the digital version of the attack to misclassify images from the ImageNet dataset (still restricting perturbations to be in our physically realizable subset), and evaluating the system on two real-world tasks: classifying a computer keyboard as a computer mouse, and classifying a stop sign as a guitar pick. We also detail some key results in the process of fitting the threat model to real data. (Overall introduction to the section.)

4.1. Experimental setup

All our experiments consider fooling a ResNet-50 (He et al., 2016) classifier, pretrained using the ImageNet dataset (Deng et al., 2009); we specifically use the pretrained model included in the PyTorch library (Paszke et al., 2017). …

4.2. Training and Classification on ImageNet (The subsection titles in this paper are also quite detailed.)

To train and evaluate the system on a broad range of images, we use

Figure 4 shows the learned perturbations for two instances. (The figure is introduced directly, followed by explanation and analysis.)

Table 1 shows the ability of our learned perturbations (6 dots) to fool images from the ImageNet test set for these two categories. We also show the average success rate for fooling stop sign images into 50 random classes. More generally, we also include … (Descriptive wording.)
Unrealizable attacks:

4.3. Evaluation of attacks in the real world

In this section we present the main empirical result of this work, illustrating that the perturbations we produce can be adversarial in the real world when printed and applied physically to a camera, and when viewing a target object at multiple angles and scales.
Figures 6 and 7 show several snapshots of the process for both the keyboard and stop sign tests. (Direct description that leads into the method's advantage: inconspicuousness.)
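
The two headline metrics for such a video evaluation, classifier accuracy and targeted fooling rate, can be computed per frame as sketched below. The function name `fooling_stats` and the frame labels are hypothetical and are not data from the paper. The sketch assumes that accuracy counts frames predicted as the true class and that the fooling rate counts frames predicted as the attacker's target class.

```python
def fooling_stats(predictions, true_label, target_label):
    """Return (accuracy, targeted fooling rate) over per-frame predictions."""
    n = len(predictions)
    if n == 0:
        raise ValueError("predictions must not be empty")
    # Fraction of frames still classified as the true class.
    accuracy = sum(p == true_label for p in predictions) / n
    # Fraction of frames classified as the attacker's target class.
    fooling_rate = sum(p == target_label for p in predictions) / n
    return accuracy, fooling_rate

# Hypothetical per-frame labels for a keyboard filmed through a sticker.
frames = ["keyboard", "mouse", "mouse", "space bar", "mouse"]
accuracy, fooling_rate = fooling_stats(frames, "keyboard", "mouse")
```

Note that the two numbers need not sum to one, because a frame can be misclassified as something other than the target class.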

4.4. Other experiments

Finally, because there are several aspects of interest regarding both the power of the threat model we consider and our ability to physically manufacture such dots, we here present additional evaluations that highlight aspects of the setup.

Effect of the number of dots

Effect of printed dots on camera perturbations

