Therefore, this limitation did not invalidate our results.
Our analysis included 600 conferences consisting of 14,017 full papers
and 1,508 issues of journals consisting of 10,277 articles published from
1970 to 2005. Their citation counts
were based on our full data set consisting of 4,119,899 listed references
from 790,726 paper records, of which
1,536,923 references were resolved
within the data set itself and could be
counted toward citation counts. Overall,
the conference papers had an average
two-year citation count of 2.15 and the
journal papers an average two-year citation count of 1.53. These counts follow
a highly skewed distribution (see Figure 1), with over 70% of papers receiving no more than two citations. Note
that while the average two-year citation
count for conferences was higher than
that for journals, the average four-year
citation count for articles published before
2003 was 3.16 for conferences vs. 4.03
for journals; that is, on average, journals come out a little ahead of conference proceedings over the longer term.
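As an illustration of how such two-year counts can be derived from resolved references, here is a minimal sketch; the record format, paper IDs, and the exact two-year window (publication year plus two) are assumptions for illustration, not the study's actual pipeline:

```python
from collections import Counter

# Hypothetical paper records: id -> publication year (not the study's data).
papers = {"p1": 1998, "p2": 1999, "p3": 2000, "p4": 2001}

# Resolved references: (citing paper, cited paper). Only references that
# resolve to a paper inside the data set can count toward citations.
references = [("p2", "p1"), ("p3", "p1"), ("p4", "p1"),
              ("p3", "p2"), ("p4", "p3")]

def two_year_counts(papers, references):
    """Count, per paper, citations arriving within two years of publication."""
    counts = Counter({pid: 0 for pid in papers})
    for citing, cited in references:
        if citing in papers and cited in papers:
            if 0 <= papers[citing] - papers[cited] <= 2:
                counts[cited] += 1
    return counts

counts = two_year_counts(papers, references)
average = sum(counts.values()) / len(counts)
```

On this toy data, "p1" collects two in-window citations (the 2001 citation falls outside the window), and the average two-year count is the analogue of the 2.15 and 1.53 figures above.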
Figure 1. Citation count distribution within two years of publication (y-axis: fraction of papers; x-axis: number of citations).
Figure 2. Average citation count vs. acceptance rate for ACM conferences.
We addressed the first question—on how
a conference’s acceptance rate correlates
with the impact of its papers—by correlating citation count with acceptance
rate; Figure 2 shows a scatterplot of average citation counts of ACM conferences
(y-axis) by their acceptance rates (x-axis).
Citation count varies substantially across
the spectrum of acceptance rates, with
a clear trend toward more citations for
low acceptance rates; we observed a statistically significant correlation between
the two values (each paper treated as
a sample, F[1, 14015] = 970.5, p<.001c)
and computed both a linear regression
line (each conference weighted by its
size; adjusted R-square: 0.258, weighted
residual sum-of-squares: 35,311) and a
nonlinear regression curve of the form
y = a + bx^(−c) (each conference weighted
by its size; pseudo R-square: 0.325,
weighted residual sum-of-squares:
32,222), as shown in Figure 2.

Figure 3 is an aggregate view of the
data, where we grouped conferences
into bins according to acceptance
rates and computed the average citation
counts of each bin.d Citation counts
for journal articles are shown as a
dashed line for comparison. Conferences
with acceptance rates below 20%
enjoyed an average citation count as
high as 3.5. Less-selective conferences
yielded fewer citations per paper, with
the least-selective conferences (>55%
acceptance rate) averaging less than
half a citation per paper.

… 0.852, showing that overall ACM citation
count is proportional to Google Scholar
citation count with a small variation.
When added as an additional parameter
to the regression, acceptance rate had a
nonsignificant coefficient, showing that
acceptance rate does not have a significant
effect on the difference between ACM
citation count and Google Scholar citation
count. We also hand-checked 50 randomly
selected conference papers receiving no
citations in our data set, finding no
correlation between acceptance rate and
Google Scholar citation count.

c This F-statistic shows how well a linear
relationship between acceptance rate and
citation count explains the variance within
citation count. The notation F[1, 14015] =
970.5, p<.001 signifies one degree of
freedom for the model (from using only
acceptance rate to explain citation counts),
14,015 degrees of freedom for error (from
the more than 14,000 conference papers in
our analysis), an F-statistic of 970.5, and a
probability of less than 0.001 that the
correlation between acceptance rate and
citation count is the result of random chance.
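The quantities in footnote c can be illustrated on toy numbers; the sketch below implements the standard F[1, n−2] statistic for simple linear regression, with made-up acceptance rates and citation counts rather than the study's data:

```python
# F-statistic for a one-predictor linear model: how much citation-count
# variance the acceptance-rate line explains, relative to the residual.

def regression_f(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                      # least-squares slope
    a = my - b * mx                    # intercept
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_model = ss_tot - ss_res         # variance explained by the model
    # F[1, n-2] = (explained / 1 df) / (residual / (n - 2) df)
    return (ss_model / 1) / (ss_res / (n - 2))

acc = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]   # acceptance rates (synthetic)
cites = [3.5, 2.8, 2.1, 1.4, 0.9, 0.4]       # avg citations (synthetic)
f_stat = regression_f(acc, cites)
```

A large F with 1 and n−2 degrees of freedom corresponds to a small p-value, which is what the reported F[1, 14015] = 970.5, p<.001 expresses at the scale of 14,017 papers.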
d We excluded conferences with acceptance
rates below 10% or above 60%, as there
were too few conferences in these
categories for meaningful analysis.
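The binned view behind Figure 3, together with the exclusions in footnote d, can be sketched as follows; the bin width, tuple format, and all numbers are illustrative assumptions, not the study's data:

```python
# Each conference: (acceptance_rate, paper_count, avg_citations) -- synthetic.
conferences = [
    (0.12, 200, 3.6), (0.18, 150, 3.2), (0.27, 120, 2.0),
    (0.33, 100, 1.6), (0.48, 80, 0.9), (0.58, 60, 0.4),
    (0.08, 10, 5.0), (0.65, 15, 0.2),   # sparse extremes, excluded below
]

def binned_citations(confs, width=0.05, lo=0.10, hi=0.60):
    """Group conferences into acceptance-rate bins; return per-bin
    paper-weighted average citation counts."""
    bins = {}
    for rate, n_papers, avg_cites in confs:
        if not (lo <= rate <= hi):      # footnote d: drop sparse extremes
            continue
        key = int((rate - lo) / width)  # bin index by acceptance rate
        tot_papers, tot_cites = bins.get(key, (0, 0.0))
        bins[key] = (tot_papers + n_papers, tot_cites + n_papers * avg_cites)
    return {k: c / p for k, (p, c) in sorted(bins.items())}

result = binned_citations(conferences)
```

Weighting each conference by its paper count, as the regression above does, keeps a few small conferences from dominating a bin's average.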