tion of a block with just seven people is
an insignificant risk for the country as
a whole, this attack can be performed
for virtually every block in the United
States using the data provided in the
2010 census. The final section of this
article discusses the implications of
this for the 2020 decennial census.
An Example Database Reconstruction Attack
To present the attack, let’s consider the
census of a fictional geographic frame
(for example, a suburban block), conducted by a fictional statistical agency. For every block, the agency collects
each resident’s age, sex, and race, and
publishes a variety of statistics. To simplify the example, this fictional world
has only two races—black or African
American, and white—and two sexes—
female and male.
The statistical agency is prohibited
from publishing the raw microdata
and instead publishes a tabular report.
Table 1 shows fictional statistical data
for a fictional block published by the
fictional statistics agency. The “statistic”
column is for identification purposes only.
Notice that a substantial amount
of information in Table 1 has been
suppressed—marked with a (D). In
this case, the statistical agency’s disclosure-avoidance rules prohibit it
from publishing statistics based on
one or two people. This suppression
rule is sometimes called “the rule of
three,” because cells in the report
ential privacy. They provided a mathematical definition of the privacy loss
that persons suffer as a result of a data
publication, and they proposed a mechanism for determining how much noise
must be added for any given level of privacy protection (the authors received
the Test of Time award at the Theory of
Cryptography Conference in 2016 and
the Gödel Prize in 2017).
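As a rough illustration of calibrating noise to a privacy level, the sketch below adds Laplace noise to a single count. It is a minimal example and not the mechanism any statistical agency actually deploys. The function names, the privacy parameter `epsilon`, and the sensitivity of 1 (adding or removing one person changes a count by at most 1) are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution centered at 0.

    The difference of two independent exponential samples with
    rate 1/scale follows a Laplace(0, scale) distribution.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return the count plus Laplace noise with scale sensitivity/epsilon.

    A smaller epsilon (stronger protection) gives a larger noise scale.
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical usage: publish a count of 7 many times with epsilon = 0.5.
rng = random.Random(0)
samples = [noisy_count(7, epsilon=0.5, rng=rng) for _ in range(20000)]
average = sum(samples) / len(samples)
```

Each individual release is perturbed, while the average over many hypothetical releases stays close to the true count. This is why the noise creates uncertainty about individuals without systematically biasing the statistic.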
The 2020 census is expected to count
approximately 330 million people living
on about 8.5 million blocks, with some
inhabited blocks having as few as a single
person and other blocks having thousands.
With this level of scale and diversity, it is
difficult to visualize how such a data
release might be susceptible to database
reconstruction. We now know, however,
that reconstruction would in fact pose a
significant threat to the confidentiality of
the 2020 microdata that underlies
unprotected statistical tables if
privacy-protecting measures are not
implemented. To help understand the
importance of adopting formal privacy
methods, this article presents a database
reconstruction of a much smaller
statistical publication: a hypothetical
block containing seven people distributed
over two households. (The 2010 U.S. Census
contained 1,539,183 census blocks in the
50 states and the District of Columbia
with between one and seven residents.
The data can be downloaded from
https://bit.ly/2L0Mk51)
Even a relatively small number of
constraints results in an exact solution
for the blocks’ inhabitants. Differential privacy can protect the published
data by creating uncertainty. Although
readers may think that the reconstruc-
Table 3. Variables associated with the reconstruction attack.
Person   Age   Sex   Race   Marital Status
1        A1    S1    R1     M1
2        A2    S2    R2     M2
3        A3    S3    R3     M3
4        A4    S4    R4     M4
5        A5    S5    R5     M5
6        A6    S6    R6     M6
7        A7    S7    R7     M7
Key
Sex: Female = 0, Male = 1
Race: Black or African American = 0, White = 1
Marital Status: Single = 0, Married = 1
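The encoding in Table 3 can be sketched in code to show how a published statistic acts as a constraint on the unknowns. The example below is only an illustration. The `Person` type, the `matches` helper, and the candidate records are hypothetical and are not taken from the article. The check uses statistic 2B from Table 1 (three males, median age 30, mean age 44).

```python
from statistics import median
from typing import NamedTuple

# Encodings taken from the key of Table 3.
FEMALE, MALE = 0, 1
BLACK, WHITE = 0, 1
SINGLE, MARRIED = 0, 1


class Person(NamedTuple):
    age: int      # A_i
    sex: int      # S_i
    race: int     # R_i
    marital: int  # M_i


def matches(people, predicate, count, med, mean):
    """Check whether the group selected by predicate has the published
    count, median age, and mean age (mean rounded to one decimal)."""
    ages = [p.age for p in people if predicate(p)]
    if len(ages) != count or count == 0:
        return False
    return median(ages) == med and round(sum(ages) / len(ages), 1) == mean


# Hypothetical candidate records, for illustration only.
candidate = [
    Person(10, MALE, WHITE, SINGLE),
    Person(30, MALE, WHITE, MARRIED),
    Person(92, MALE, BLACK, MARRIED),
    Person(40, FEMALE, BLACK, MARRIED),
]

ok_2b = matches(candidate, lambda p: p.sex == MALE, 3, 30, 44)
```

A reconstruction attack searches for assignments of all 28 variables that pass every such check at once. A candidate that fails any published statistic is discarded.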
Table 1. Fictional statistical data for a fictional block.
Statistic   Group                               Count   Median Age   Mean Age
1A          Total Population                    7       30           38
2A          Female                              4       30           33.5
2B          Male                                3       30           44
2C          Black or African American           4       51           48.5
2D          White                               3       24           24
3A          Single Adults                       (D)     (D)          (D)
3B          Married Adults                      4       51           54
4A          Black or African American Female    3       36           36.7
4B          Black or African American Male      (D)     (D)          (D)
4C          White Male                          (D)     (D)          (D)
4D          White Female                        (D)     (D)          (D)
5A          Persons Under 5 Years               (D)     (D)          (D)
5B          Persons Under 18 Years              (D)     (D)          (D)
5C          Persons 64 Years or Over            (D)     (D)          (D)
Note: Married persons must be 15 or over.
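The (D) entries in Table 1 follow from the suppression rule for statistics based on one or two people. A minimal sketch of such a rule is shown below. The `publish` function, the handling of empty groups, and the sample ages are assumptions for illustration and do not come from the fictional agency's data.

```python
from statistics import median

SUPPRESSED = "(D)"


def publish(ages):
    """Return (count, median, mean) for a group of ages, or (D) markers
    when the group has one or two members.

    Treating an empty group as a publishable count of 0 is an assumption
    of this sketch; the source states the rule only for one or two people.
    """
    n = len(ages)
    if n == 0:
        return (0, None, None)
    if n <= 2:
        return (SUPPRESSED, SUPPRESSED, SUPPRESSED)
    return (n, median(ages), round(sum(ages) / n, 1))


# Hypothetical groups: three people are published, two are suppressed.
published = publish([10, 20, 60])
hidden = publish([40, 50])
```

Suppression hides the small cells themselves, but the totals and the remaining published cells still constrain what the suppressed values can be.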
Table 2. Possible ages for a median of 30 and a mean of 44.
A    B    C      A    B    C     A    B    C
1    30   101    11   30   91    21   30   81
2    30   100    12   30   90    22   30   80
3    30   99     13   30   89    23   30   79
4    30   98     14   30   88    24   30   78
5    30   97     15   30   87    25   30   77
6    30   96     16   30   86    26   30   76
7    30   95     17   30   85    27   30   75
8    30   94     18   30   84    28   30   74
9    30   93     19   30   83    29   30   73
10   30   92     20   30   82    30   30   72
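The rows of Table 2 can be reproduced with a few lines of arithmetic. For three people with median 30 and mean 44, the middle age B is 30 and the ages sum to 3 × 44 = 132, so A + C = 102 with A ≤ 30 ≤ C. The sketch below enumerates these combinations. The function name is hypothetical, and the minimum age of 1 is an assumption inferred from the table's first row.

```python
def three_ages_with(median_age, mean_age, min_age=1):
    """List (A, B, C) with A <= B <= C, B equal to the median, and
    (A + B + C) / 3 equal to the mean.

    min_age=1 is an assumption chosen to match the first row of Table 2.
    """
    total = 3 * mean_age
    combos = []
    for youngest in range(min_age, median_age + 1):
        oldest = total - median_age - youngest
        if oldest >= median_age:
            combos.append((youngest, median_age, oldest))
    return combos


rows = three_ages_with(30, 44)
```

Here `rows` holds 30 combinations, running from (1, 30, 101) to (30, 30, 72), which matches the table. The median and mean alone narrow the three ages to a short list that other statistics can then cut down further.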