depend on the underlying programming language. (2) Generic
bug: more generic in nature and has less to do with project
function, for example, type errors, concurrency errors, etc.
Consequently, it is reasonable to think that the interaction of
application domain and language might impact the number
of defects within a project. Since some languages are believed
to excel at some tasks more so than others, for example, C for
low level work, or Java for user applications, making an inappropriate choice might lead to a greater number of defects. To
study this we should ideally ignore the domain specific bugs,
as generic bugs are more likely to depend on the programming
language featured. However, since domain-specific bugs may
also arise due to a generic programming error, it is difficult to
separate the two. A possible workaround is to study languages
while controlling the domain. Statistically, however, with 17
languages across 7 domains, the large number of terms would
be challenging to interpret given the sample size.
Given this, we first consider testing for the dependence
between domain and language usage within a project, using
a Chi-square test of independence. Of 119 cells, 46, that is,
39%, are below the value of 5, which is too high. No more than
20% of the counts should be below 5 [14]. We include the value
here for completeness (see footnote d); however, the low strength of association of 0.191, as measured by Cramér’s V, suggests that any
relationship between domain and language is small and that
inclusion of domain in regression models would not produce meaningful results.
One option to address this concern would be to remove
languages or combine domains; however, our data here presents
no clear choices. Alternatively, we could combine languages;
this choice leads to a related but slightly different question.
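The independence check described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis used in the study: the 3 × 2 count table is hypothetical, the function name `chi_square_independence` is illustrative, and the sketch assumes that the "below 5" rule of thumb is applied to expected cell counts.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic for a contingency table.

    Returns (chi2, dof, share_low, cramers_v), where share_low is the
    fraction of cells whose expected count is below 5.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    n_rows, n_cols = len(table), len(table[0])

    chi2 = 0.0
    low_cells = 0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected < 5:
                low_cells += 1
            chi2 += (observed - expected) ** 2 / expected

    dof = (n_rows - 1) * (n_cols - 1)
    # Cramer's V rescales chi2 to the range [0, 1].
    cramers_v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, dof, low_cells / (n_rows * n_cols), cramers_v


# Hypothetical counts: rows stand for languages, columns for domains.
counts = [[30, 10], [20, 20], [4, 3]]
chi2, dof, share_low, v = chi_square_independence(counts)
print(f"chi2={chi2:.2f}, dof={dof}, low cells={share_low:.0%}, V={v:.3f}")

# Degrees of freedom for a 17 x 7 table, as in the text.
print((17 - 1) * (7 - 1))  # 96
```

The last line gives the 96 degrees of freedom reported in footnote d, and 46 of 119 cells corresponds to roughly 39%.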
RQ2. Which language properties relate to defects?
Rather than considering languages individually, we aggregate them by language class, as described in Section 2.2, and
analyze the relationship to defects. Broadly, each of these
properties divides languages along lines that are often discussed in the context of errors, drives user debate, or has
been the subject of prior work. Since the individual properties are highly correlated, we create six model factors that
combine all of the individual factors across all of the languages
in our study. We then model the impact of the six different
factors on the number of defects while controlling for the
same basic covariates that we used in the model in RQ1.
As with language (earlier in Table 6), we are comparing
language classes with the average behavior across all language classes. The model is presented in Table 7. It is clear
that the Script-Dynamic-Explicit-Managed class has
the smallest magnitude coefficient. The coefficient is insignificant, that is, the z-test for the coefficient cannot distinguish the coefficient from zero. Given the magnitude of the
standard error, however, we can assume that the behavior
of languages in this class is very close to the average across
all languages. We confirm this by recoding the coefficient
using Proc-Static-Implicit-Unmanaged as the base
level and employing treatment, or dummy, coding that compares each language class with the base level. In this case,
Script-Dynamic-Explicit-Managed is significantly
different with p = 0.00044. We note here that while choosing
different coding methods affects the coefficients and
z-scores, the models are identical in all other respects. When
we change the coding we are rescaling the coefficients to
reflect the comparison that we wish to make [4]. Comparing the
other language classes to the grand mean, Proc-Static-Implicit-Unmanaged
languages are more likely to induce
defects. This implies that either implicit type conversion or
memory management issues contribute to greater defect
proneness as compared with other procedural languages.
Among scripting languages we observe a similar relationship
between languages that allow versus those that do not
allow implicit type conversion, providing some evidence that
implicit type conversion (vs. explicit) is responsible for this
difference as opposed to memory management. We cannot state
this conclusively given the correlation between factors. However,
when compared to the average, as a group, languages that do
not allow implicit type conversion are less error-prone while
those that do are more error-prone. The contrast between static
and dynamic typing is also visible in functional languages.
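The effect of recoding discussed earlier can be illustrated with a deliberately simplified case: a model with a single categorical factor and no covariates, where each coefficient is just a difference of group means. The class labels and values below are hypothetical, and the model in the text also includes covariates, so this is only a sketch of why the fitted model does not change when the coding does.

```python
from statistics import mean

# Hypothetical outcomes per language class (not data from the study).
groups = {
    "A": [2.0, 3.0, 4.0],
    "B": [5.0, 7.0],
    "C": [1.0, 1.0, 2.0, 4.0],
}

group_means = {g: mean(v) for g, v in groups.items()}
# Weighted grand mean: every observation counts once,
# so larger classes carry more weight.
grand_mean = mean(x for v in groups.values() for x in v)

# Weighted effects coding: the intercept is the grand mean and each
# coefficient is a class's deviation from it.
effects = {g: m - grand_mean for g, m in group_means.items()}

# Treatment (dummy) coding: the intercept is the base-level mean and each
# coefficient is a class's deviation from the base level.
base = "C"
treatment = {g: m - group_means[base] for g, m in group_means.items() if g != base}

for g in groups:
    fitted_effects = grand_mean + effects[g]
    fitted_treatment = group_means[base] + treatment.get(g, 0.0)
    # Both codings reproduce the same fitted value for every class.
    assert abs(fitted_effects - fitted_treatment) < 1e-12
    print(g, round(effects[g], 3), round(treatment.get(g, 0.0), 3))
```

In this simplified setting a treatment-coded coefficient equals the difference between two effects-coded coefficients, which is why changing the coding changes which comparison the z-test addresses without changing the fit.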
The functional languages as a group show a strong difference from the average. Statically typed languages have a
substantially smaller coefficient yet both functional language
classes have the same standard error. This is strong evidence
that functional static languages are less error-prone than
functional dynamic languages; however, the z-tests only test
whether the coefficients are different from zero. In order to
strengthen this assertion, we recode the model as above using
treatment coding and observe that the Functional-Static-Explicit-Managed language class is significantly less
defect-prone than the Functional-Dynamic-Explicit-Managed language class with p = 0.034.
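The z-tests mentioned above can be approximated from the rounded estimates and standard errors reported in Table 7. The sketch below is an illustration only, assuming a standard normal reference distribution and a two-sided test; because the published values are rounded, the resulting p-values are approximate, and the helper name `two_sided_p` is illustrative.

```python
import math

# (estimate, standard error) as printed in Table 7; values are rounded.
table7 = {
    "Functional-Static-Explicit-Managed": (-0.25, 0.04),
    "Functional-Dynamic-Explicit-Managed": (-0.17, 0.04),
    "Proc-Static-Explicit-Managed": (-0.06, 0.03),
    "Script-Dynamic-Explicit-Managed": (0.001, 0.03),
    "Script-Dynamic-Implicit-Managed": (0.04, 0.02),
    "Proc-Static-Implicit-Unmanaged": (0.14, 0.02),
}


def two_sided_p(estimate, std_err):
    """z-score and two-sided p-value for H0: coefficient == 0."""
    z = estimate / std_err
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return z, math.erfc(abs(z) / math.sqrt(2))


for name, (est, se) in table7.items():
    z, p = two_sided_p(est, se)
    print(f"{name:38s} z={z:6.2f}  p={p:.3g}")
```

Each test compares one coefficient with zero. Testing the difference between two classes additionally requires the covariance of the two estimates, which the table does not report; refitting with treatment coding yields that comparison directly.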
Term                                   Coef. (Std. Err.)
(Intercept)                            −2.13 (0.10)***
Log commits                             0.96 (0.01)***
Log age                                 0.07 (0.01)***
Log size                                0.05 (0.01)***
Log devs                                0.07 (0.01)***
Functional-Static-Explicit-Managed     −0.25 (0.04)***
Functional-Dynamic-Explicit-Managed    −0.17 (0.04)***
Proc-Static-Explicit-Managed           −0.06 (0.03)*
Script-Dynamic-Explicit-Managed         0.001 (0.03)
Script-Dynamic-Implicit-Managed         0.04 (0.02)*
Proc-Static-Implicit-Unmanaged          0.14 (0.02)***
Language classes coded with weighted effects codes (AIC = 10,419, Deviance = 1132,
Num. obs. = 1067).
***p < 0.001, **p < 0.01, *p < 0.05.
Table 7. Functional languages have a smaller relationship to defects
than other language classes whereas procedural languages are
greater than or similar to the average.
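One way to read the language-class coefficients in Table 7 is as multiplicative effects. The sketch below assumes the model uses a log link, as count regressions commonly do; under that assumption, exponentiating a coefficient gives the factor by which the expected number of defective commits changes relative to the weighted average across classes, with the other covariates held fixed. The function name `rate_ratio` is illustrative, and the inputs are the rounded values from the table.

```python
import math

# Language-class coefficients as printed in Table 7 (rounded).
coefficients = {
    "Functional-Static-Explicit-Managed": -0.25,
    "Functional-Dynamic-Explicit-Managed": -0.17,
    "Proc-Static-Explicit-Managed": -0.06,
    "Script-Dynamic-Explicit-Managed": 0.001,
    "Script-Dynamic-Implicit-Managed": 0.04,
    "Proc-Static-Implicit-Unmanaged": 0.14,
}


def rate_ratio(coefficient):
    """Multiplicative effect implied by a coefficient under a log link (assumed)."""
    return math.exp(coefficient)


for name, coef in coefficients.items():
    ratio = rate_ratio(coef)
    # Percentage change relative to the average across language classes.
    print(f"{name:38s} x{ratio:.3f}  ({(ratio - 1) * 100:+.1f}%)")
```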
d Chi-squared value of 243.6 with 96 df and p = 8.394e−15.
              Df   Deviance    Resid. Df   Resid. Dev   Pr(>Chi)
NULL                           1066        32,995.23
Log commits   1    31,634.32   1065        1360.91      0
Log age       1    51.04       1064        1309.87      0
Log size      1    50.82       1063        1259.05      0
Log devs      1    31.11       1062        1227.94      0
Lang. class   5    95.54       1057        1132.40      0
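The residual deviance column in the table above follows from subtracting each term's deviance in sequence from the null deviance. The short sketch below reproduces that arithmetic from the listed values and also reports each term's share of the null deviance; the variable names are illustrative.

```python
# Null deviance and per-term deviance as listed in the table above.
null_deviance = 32995.23
terms = [
    ("Log commits", 31634.32),
    ("Log age", 51.04),
    ("Log size", 50.82),
    ("Log devs", 31.11),
    ("Lang. class", 95.54),
]

residual = null_deviance
for name, explained in terms:
    # Each term reduces the residual deviance left by the terms before it.
    residual -= explained
    share = explained / null_deviance
    print(f"{name:12s} resid. dev = {residual:9.2f}  share of null deviance = {share:.2%}")
```

With these figures, the language-class term accounts for roughly 0.3% of the null deviance, while log commits accounts for about 96%.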