84 COMMUNICATIONS OF THE ACM | FEBRUARY 2020 | VOL. 63 | NO. 2
te is always fixed within each frame, no single te can circumvent all the frequency components.
However, we cannot choose and switch the flickering frequencies arbitrarily, for three reasons.
(i) Multiple frequency values that share a common divisor
can satisfy te = Ntl under the same te (recall that N can be an arbitrary integer). We need to ensure that the common divisor is small
enough (i.e., the least common multiple of the periods tl is large enough),
so that overexposure occurs even for the smallest N. (ii)
Frequencies should be kept low to maximize image corruption, as shown in the “Optimizing the LED waveform” section,
since the camera’s analog gain decreases at high frequencies [20].
(iii) Switching between different frequencies may create an
additional level of modulation, which spreads the spectrum and generates unexpected low-frequency components
that become perceivable to the eye.
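Reason (i) can be checked numerically. An exposure time te evades a flickering frequency f when te · f is a whole number of periods, so for a set of frequencies the shortest evading exposure is the reciprocal of their greatest common divisor. The sketch below assumes integer frequencies in Hz so that `math.gcd` applies; the function name and the example values are illustrative and do not come from the paper.

```python
from functools import reduce
from math import gcd


def shortest_evading_exposure(freqs_hz):
    """Return the smallest exposure time (in seconds) that spans a whole
    number of periods of every frequency in freqs_hz.

    Assumption: frequencies are positive integers in Hz.
    """
    if not freqs_hz or any(f <= 0 for f in freqs_hz):
        raise ValueError("freqs_hz must contain positive integers")
    common_divisor = reduce(gcd, freqs_hz)
    return 1.0 / common_divisor


# Illustrative values only: a large common divisor (100 Hz) lets a short
# exposure evade all three frequencies at once.
print(shortest_evading_exposure([200, 300, 400]))
# A smaller common divisor (50 Hz) forces a longer evading exposure,
# which is more likely to overexpose.
print(shortest_evading_exposure([200, 250, 300]))
```

A smaller common divisor pushes the shortest evading exposure up, which is the direction the text asks for.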
To explore the design space under these constraints, suppose we switch among M frequencies f1, f2, …, fM (in ascending
order) at a switching rate fB. The whole pattern thus repeats
itself at rate fp = fB/M. To pack at least M different frequencies
into an image frame, we need fB > (M − 1)fr or, preferably, fB > M fr,
where fr is the frame rate, typically around 30 Hz (i.e., 30 fps). To
maximize image corruption, we choose the smallest value
for f1 (i.e., f1 = fB) and empirically set fn = fB + (n − 1)∆f, n = 2,
3, …, M, where ∆f is the frequency increment, chosen such that ∆f ≠ fB to
lower the common divisor frequency.
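The construction above can be sketched in a few lines. The formulas fn = fB + (n − 1)∆f, fp = fB/M, and fB > M fr are taken from the text; the function names and the example values fB = 200 Hz and M = 3 are assumptions made here for illustration.

```python
def scrambling_frequencies(f_b, delta_f, m):
    """Return [f_1, ..., f_M] with f_1 = f_B and f_n = f_B + (n - 1) * delta_f."""
    if delta_f == f_b:
        # The text requires delta_f != f_B to lower the common divisor frequency.
        raise ValueError("delta_f must differ from f_b")
    return [f_b + (n - 1) * delta_f for n in range(1, m + 1)]


def pattern_rate(f_b, m):
    """Rate f_p = f_B / M at which the whole switching pattern repeats."""
    return f_b / m


def packs_m_frequencies(f_b, m, f_r=30.0, preferred=True):
    """Check f_B > M * f_r (preferred) or f_B > (M - 1) * f_r (minimum)."""
    bound = m * f_r if preferred else (m - 1) * f_r
    return f_b > bound


# Illustrative parameters (assumed, not from the paper): f_B = 200 Hz, M = 3.
freqs = scrambling_frequencies(200, 50, 3)
print(freqs)                        # [200, 250, 300]
print(pattern_rate(200, 3))         # about 66.7 Hz
print(packs_m_frequencies(200, 3))  # True, since 200 > 3 * 30
```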
The frequency scrambling can be considered an
M-FSK modulation, which creates side lobes around each
scrambling frequency, spaced fp apart (interested readers can refer to the full version of this work [22] for the theoretical underpinning). These side lobes might appear in the
low-frequency region and become perceptible to the eye. To
ensure that no side lobe exists below the perceivable threshold fth ≈ 80 Hz, we need a small M and a large fB, and hence
higher flickering frequency components fn. Yet, increasing
the flickering frequencies may weaken LiShield’s protection. To find the optimal ∆f and showcase the effectiveness of frequency scrambling, we repeat the numerical
simulation (Section 2.1) to evaluate the attacker’s maximum image quality. Based on the simulation, we set
∆f = 50 Hz to maximize image disruption. The optimal ∆f
for other peak intensity settings can be obtained following
a similar procedure.
Illumination intensity randomization. If attackers repeatedly capture a static scene for a sufficiently long duration,
they may eventually find at least one clean version of each
row across all frames, thus recovering the image. LiShield can
increase the number of frames needed for image recovery,
so that the attack becomes infeasible unless the camera can
stay perfectly still over a long period of time, during which the
attackers may already have been discovered by the owners of
the physical space. LiShield achieves this goal by employing
illumination intensity randomization, in which it randomly
switches the magnitude of each ON period among multiple
predefined levels, thereby enlarging the attacker’s search space.
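A minimal sketch of the randomization step follows, assuming the magnitudes are expressed as fractions of the peak intensity. The level values, the function name, and the use of Python's `random` module are assumptions for illustration; the text does not specify the levels or the random source.

```python
import random


def random_on_levels(num_periods, levels, seed=None):
    """Pick one magnitude per ON period, uniformly from the predefined levels.

    Assumption: levels are hypothetical fractions of the peak intensity.
    The seed is only for reproducible experiments; a deployment would
    presumably leave it unset or use a stronger random source.
    """
    if not levels:
        raise ValueError("levels must not be empty")
    rng = random.Random(seed)
    return [rng.choice(levels) for _ in range(num_periods)]


# Illustrative levels (assumed): 50%, 75%, and 100% of peak intensity.
print(random_on_levels(8, [0.5, 0.75, 1.0], seed=1))
```

With L levels and K ON periods, there are L^K possible magnitude sequences, which is the sense in which the attacker's search space grows.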
3. SCENE RECOVERY WITH AUTHORIZED CAMERAS
To allow authorized users to capture the scene while maintaining
protection against unauthorized attackers, we need
to impose additional constraints on the LED waveform.
LiShield’s solution leverages a secure side channel (e.g.,
WiFi [4]) between authorized users and the smart LED, which
conveys secret information such as frame timing and wave-
A naive solution is to stop flickering when authorized
users are recording. However, since attackers may be co-located with the authorized users, this would let them capture one or more frames that contain part of the clean scene,
which compromises privacy and security. Instead, we design
special waveforms for the LED to counteract such cases.
3.1. Authorized video recording
To authorize a camera to capture a dynamic scene, each
individual frame within the video must be recoverable.
To achieve this, the authorized camera needs to convey its
exposure time setting tue to the smart LED via the secure side
channel and synchronize its clock (which controls the capture
time) with the smart LED’s clock (which controls the waveform), so that the smart LED can send recoverable waveforms
precisely while the authorized camera is capturing. State-of-the-art time synchronization mechanisms (e.g., [4]) can
already achieve microsecond-level accuracy, sufficient to synchronize the
LiShield smart LED with the camera at a resolution finer
than the rolling shutter period (typically tens of µs).
Recall that the camera can evade the striping effects if
te = Ntl. So to authorize the user with exposure time tue, LiShield
simply needs to set its flickering frequency to fa = 1/tl = N/tue
(N = 1, 2, …) and maintain its peak intensity within each
frame. In addition, tue and the corresponding flickering frequency fa can be varied on a frame-by-frame basis, making
it impossible for an attacker to resolve the correct exposure
time by trial and error (Section 2.2).
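The relation fa = N/tue can be illustrated with a short sketch. The function names, the tolerance, and the example exposure times are assumptions for illustration; only the formulas fa = N/tue and te = Ntl come from the text.

```python
def authorized_frequency(t_e_u, n=1):
    """Flickering frequency f_a = N / t_e^u for an authorized exposure time
    t_e_u (seconds) and a positive integer N."""
    if t_e_u <= 0:
        raise ValueError("exposure time must be positive")
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n / t_e_u


def evades_striping(t_e, freq_hz, tol=1e-9):
    """True if exposure t_e spans a whole number (at least one) of flicker
    periods, i.e. t_e = N * t_l with t_l = 1 / freq_hz."""
    cycles = t_e * freq_hz
    return round(cycles) >= 1 and abs(cycles - round(cycles)) < tol


# Illustrative values (assumed): authorized exposure of 4 ms, N = 1.
f_a = authorized_frequency(0.004, n=1)   # roughly 250 Hz
print(evades_striping(0.004, f_a))       # authorized exposure: no stripes
print(evades_striping(0.003, f_a))       # a different exposure: striped
```

Because tue and N can change per frame, an attacker who guesses one exposure time still fails on frames that use a different pair.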
Meanwhile, when the authorized camera is not recording at its maximum possible rate, there will be an interval
(i.e., an inter-frame gap) during which the camera pauses capturing.
LiShield packs random flickering frequencies finter other than
fintra = fa into the inter-frame gap, so as to achieve the same
scrambling effect as described in the “Frequency scrambling”
section, without compromising the authorized capturing, as
shown in Figure 2.
[Figure 2 timeline; axis: Time. Waveform segments in order: fintra, finter, fintra; frame label: Frame 2.]
Figure 2. Enabling authorized users to capture dynamic scenes while
corrupting unauthorized users.
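The timeline in Figure 2 can be approximated as a list of (start, end, frequency) segments. This is a sketch under stated assumptions: the frame period, the capture window length, the candidate frequency list, and all names are hypothetical, and the paper does not describe how finter is drawn beyond it being random and different from fintra.

```python
import random


def build_schedule(frame_period, capture_time, f_intra, candidates,
                   num_frames, seed=None):
    """Return (start_s, end_s, freq_hz) segments: f_intra while the authorized
    camera captures, then a random f_inter != f_intra in the inter-frame gap.

    Assumption: every frame starts at k * frame_period and captures for
    capture_time seconds.
    """
    if not 0 < capture_time < frame_period:
        raise ValueError("capture_time must be in (0, frame_period)")
    gap_choices = [f for f in candidates if f != f_intra]
    if not gap_choices:
        raise ValueError("need at least one candidate other than f_intra")
    rng = random.Random(seed)
    schedule = []
    for k in range(num_frames):
        start = k * frame_period
        capture_end = start + capture_time
        schedule.append((start, capture_end, f_intra))
        schedule.append((capture_end, start + frame_period,
                         rng.choice(gap_choices)))
    return schedule


# Illustrative values (assumed): 30 fps frames, 20 ms capture window.
for segment in build_schedule(1 / 30, 0.020, 250, [200, 250, 300, 350], 2, seed=0):
    print(segment)
```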