How does LiShield disrupt camera image capturing?
Cameras and human eyes perceive scenes in fundamentally
different ways. Human eyes process continuous vision by
accumulating light signals, whereas cameras slice and sample the scene at discrete intervals. Consequently, human eyes
are not sensitive to high-frequency flicker beyond around 80 Hz, in either brightness or chromaticity,7 whereas cameras can easily pick up flicker above a few kHz.20 Equally importantly, human eyes perceive brightness in a nonlinear fashion, which gives them a huge dynamic range, whereas cameras
easily suffer from overexposure and underexposure when
signals with disparate intensities mix in the same scene.
Unlike professional or industrial cameras which may
have global shutters that mimic human eyes to some degree,
nearly all consumer digital cameras, pinhole cameras, and
smartphones use the rolling-shutter sampling mechanism,8 which is the main contributor to their high-frequency sensitivity. When capturing an image frame, a rolling-shutter camera exposes each row sequentially.
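The sequential row exposure can be sketched as follows; both timing values are hypothetical, chosen only for illustration:

```python
# Rolling shutter: row i starts exposing at i * t_s, so different rows
# sample the scene at different moments in time (unlike a global
# shutter, where all rows share one exposure window).
t_s = 10   # row sampling interval in microseconds (illustrative)
t_e = 50   # per-row exposure time in microseconds (illustrative)

windows = [(i * t_s, i * t_s + t_e) for i in range(4)]
for i, (start, end) in enumerate(windows):
    print(f"row {i}: exposes during [{start}, {end}) us")
```

Because each window is shifted by t_s, any change in illumination faster than the frame time is imprinted row by row rather than averaged over the whole frame.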
LiShield harnesses the disparity between cameras and
eyes to disrupt the camera imaging without affecting human
vision. It modulates a smart LED to generate high-frequency
flickering patterns. The reflection intensity (or brightness) of
the target scene also flickers following the same pattern as the LED’s illumination, albeit at reduced intensity due to reflection loss. LiShield uses On-Off Keying (OOK) as the basic
modulation waveform (Figure 1), which does not require
complicated analog front-ends and is widely supported by
smart LEDs. Due to rolling-shutter sampling, rows of pixels that are fully exposed within the ON period will be bright, while rows exposed within the OFF period will be dark, thus causing striped patterns on the captured image (Figure 1(a, b)). Partially exposed rows take on intermediate brightness. Meanwhile,
human eyes can only perceive the smooth averaged intensity
as long as the OOK frequency exceeds 80 Hz.7,21
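As a rough numerical sketch (all timing values here are hypothetical, not taken from the paper), the striping effect can be reproduced by integrating an OOK waveform over each row's exposure window:

```python
# Simulate rolling-shutter capture of an OOK-flickering light source.
# Row i exposes over [i*t_s, i*t_s + t_e); its brightness is the
# fraction of that window during which the LED is ON.
t_on, t_off = 100, 100   # OOK ON/OFF durations in microseconds (illustrative)
t_l = t_on + t_off       # flicker period
t_s = 10                 # row sampling interval in us (illustrative)
t_e = 50                 # exposure time in us, here <= t_off

def on_fraction(start, end):
    # Fraction of [start, end) during which the LED is ON (1-us resolution).
    return sum(1 for t in range(start, end) if t % t_l < t_on) / (end - start)

rows = [on_fraction(i * t_s, i * t_s + t_e) for i in range(60)]
# With t_e <= t_off, some rows fall entirely inside the OFF period (dark)
# and others entirely inside the ON period (bright) -> a striped image.
print(min(rows), max(rows))
```

Rows whose windows straddle an ON/OFF transition land between the two extremes, which is exactly the transitional stripe described above.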
In addition, LiShield can turn on different numbers of LED bulbs/chips to generate different intensities, and can control the RGB channels of the LEDs to vary the color. In Section 2,
we will show how such flickering corrupts the spatial patterns captured by a camera.
Summary of results. We have implemented LiShield based
on a customized smart LED, which allows reconfiguration
of intensity modulation waveforms on each color channel.
Our experiments on real-world scenes demonstrate that LiShield can corrupt camera capture to an illegible level in terms of image brightness, structure, and color.
The impact is resilient against possible attacks, such as multiframe combining and denoising. On the other hand, it enables
authorized cameras to recover the image perfectly, as if no
modulation were present. Even under strong sunlight/flashlight interference, LiShield can still embed barcodes into the physical scene, which can be decoded with around 95% accuracy.
2. DISRUPTING CAMERA CAPTURING USING SMART LIGHTING
2.1. Maximizing image quality degradation
LiShield aims to minimize the captured image quality by
optimizing the LED waveform, characterized by modulation
frequency, intensity, and duty cycle. To this end, we derive
a model that can predict the image quality as a function of
LiShield’s waveform and the attacker’s camera parameters.
For simplicity, we start with a monochrome LED with a single
color channel that illuminates the space homogeneously. We
denote P as the reference image taken under a nonflickering
LED and Q as the one taken under LiShield’s LED with the
same average brightness. We assume each image has m rows
and n columns, and the light energy received by each pixel is
denoted by P(i, j) and Q(i, j), respectively. Our model focuses
on two widely adopted image quality metrics: PSNR (peak signal-to-noise ratio), which quantifies the disruption of individual pixel intensity levels, and SSIM (structural similarity),18 which measures the structural distortion to the
image (i.e., deformation effects such as stretching, banding,
and twisting). In general, the minimum PSNR and SSIM corresponding to acceptable viewing quality are in the range of
25–30 dB and 0.8–0.9, respectively.1
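For reference, PSNR between an 8-bit reference image P and a disrupted image Q can be computed as follows; the `psnr` helper and the toy 2×2 arrays are illustrative, not code from the paper:

```python
import math

def psnr(p, q, peak=255.0):
    """Peak signal-to-noise ratio (in dB) between two equal-sized
    images, given as nested lists of 8-bit pixel intensities."""
    m, n = len(p), len(p[0])
    mse = sum((p[i][j] - q[i][j]) ** 2
              for i in range(m) for j in range(n)) / (m * n)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

reference = [[100, 100], [100, 100]]
striped   = [[100, 100], [10, 10]]   # bottom rows darkened by an OFF stripe
print(round(psnr(reference, striped), 1))  # → 12.1, far below the 25-30 dB range
```

Even a single dark stripe pulls PSNR well under the acceptable-quality threshold, which is why striping is such an effective disruption.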
Decomposing the image. To compute the image qual-
ity, we need to model the intensity and width of each stripe
caused by LiShield. As illustrated in Figure 1, we use ton, toff,
and Ip to denote the on/off duration and peak intensity of the
flickering light source, and te and ts are the exposure time
and sampling interval of the rolling shutter camera. For con-
venience, denote the period of the light source as tl = ton + toff
and duty cycle as Dc = ton/tl. For pixel j in row i which starts
exposure at time ti, its light accumulation would be:
$$Q(i, j) = \alpha_{i,j} \int_{t_i}^{t_i + t_e} I_l(\tau)\, d\tau \qquad (1)$$
where αi,j is the aggregated path-loss for pixel (i, j), such
as attenuation and reflection on the photographed object,
and $I_l(\tau)$ represents the illumination waveform of the LED:

$$I_l(\tau) = \begin{cases} I_p, & 0 \le (\tau \bmod t_l) < t_{on} \\ 0, & t_{on} \le (\tau \bmod t_l) < t_l \end{cases} \qquad (2)$$
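A minimal numerical sketch of this accumulation model follows; the parameter values are illustrative, and the integral of Eq. (1) is evaluated at 1 µs resolution:

```python
# Per-row light accumulation Q(i) = alpha * integral of the OOK
# waveform I_l (Eq. (2)) over the exposure window [t_i, t_i + t_e).
I_p = 1.0                 # peak LED intensity (arbitrary units)
t_on, t_off = 100, 100    # ON/OFF durations in microseconds (illustrative)
t_l = t_on + t_off        # flicker period
alpha = 0.5               # aggregated path loss (illustrative)

def accumulation(t_i, t_e):
    # 1-us Riemann sum of I_l over the row's exposure window.
    return alpha * sum(I_p for t in range(t_i, t_i + t_e) if t % t_l < t_on)

# t_e <= t_off: some rows integrate zero light -> completely dark stripes.
short = [accumulation(10 * i, 80) for i in range(40)]
print(min(short), max(short))
# t_e = t_l: every row integrates exactly one ON period -> stripes vanish.
full = [accumulation(10 * i, t_l) for i in range(40)]
print(set(full))
```

The second case is the degenerate exposure setting discussed next: when the window spans exactly one flicker period, every row accumulates the same energy regardless of its start time.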
When the camera’s exposure time is equal to or shorter
than the LED’s OFF period (te ≤ toff), the image will contain
rows that are completely dark (Figure 1(c)). On the other
hand, when te > tl, one row-exposure period of the camera
will overlap multiple ON periods of the LED, accumulating
higher intensity (Figure 1(f)). The special case happens when
te = tl, where the integral of the LED waveform over each exposure window has a fixed value, which eventually smooths out the dark stripes
(Figure 1(e)). Without loss of generality, assume that the
Figure 1. (a) and (b) Bright, dark, and transitional stripes and their widths changing with exposure time; (c)–(f) stripe patterns of the captured image under different exposure times.