As with language and defects, the relationship between
language class and defects is based on a small effect. The
deviance explained is similar, albeit smaller, with language
class explaining much less than 1% of the deviance.
We now revisit the question of application domain. Does
domain have an interaction with language class? Does the
choice of, for example, a functional language, have an advantage
for a particular domain? As above, a chi-square test of
the relationship between these factors and the project domain
yields a value of 99.05 with df = 30 and p = 2.622e−09, allowing
us to reject the null hypothesis that the factors are independent.
Cramér's V is 0.133, a weak level of association.
Consequently, although there is some relation
between domain and language, there is only a weak relationship
between domain and language class.
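To make the two statistics concrete, the following minimal sketch computes the Pearson chi-square statistic, its degrees of freedom, and Cramér's V from a contingency table. The function names and the toy counts are illustrative assumptions, not the study's data, and the sketch assumes every row and column total is nonzero.

```python
from math import sqrt

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two factors.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * min(rows - 1, columns - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi_square_statistic(table) / (n * k))

# Hypothetical 2 x 2 counts, for illustration only.
toy_table = [[10, 20], [20, 10]]
print(chi_square_statistic(toy_table), degrees_of_freedom(toy_table), cramers_v(toy_table))
```

Because V divides the statistic by the sample size, a highly significant chi-square value can still correspond to a weak association when the table holds many observations.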
Result 2: There is a small but significant relationship between
language class and defects. Functional languages are associated
with fewer defects than either procedural or scripting languages.
It is somewhat unsatisfying that we do not observe a strong
association between language, or language class, and domain
within a project. An alternative way to view this same data
is to disregard projects and aggregate defects over all languages
and domains. Since this does not yield independent
samples, we do not attempt to analyze it statistically; rather,
we take a descriptive, visualization-based approach.
We define Defect Proneness as the ratio of bug fix commits
to total commits per language per domain. Figure 1 illustrates
the interaction between domain and language using a heat
map, where defect proneness increases from lighter to
darker cells. We investigate which language factors influence
defect fixing commits across a collection of projects
written in a variety of languages. This leads to the following
research question:
RQ3. Does language defect proneness depend on domain?
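The Defect Proneness ratio defined above can be sketched as follows. The input format (one tuple per commit) and the function name are assumptions made for illustration; the "Overall" key mirrors the Overall column of Figure 1 and assumes no domain is itself named "Overall".

```python
from collections import defaultdict

def defect_proneness(commits):
    """Return {(language, domain): bug_fix_commits / total_commits}.

    `commits` is an iterable of (language, domain, is_bug_fix) tuples
    (a hypothetical input format). Each commit also counts toward the
    (language, "Overall") cell.
    """
    total = defaultdict(int)
    fixes = defaultdict(int)
    for language, domain, is_bug_fix in commits:
        for key in ((language, domain), (language, "Overall")):
            total[key] += 1
            if is_bug_fix:
                fixes[key] += 1
    # Cells with no commits never appear in `total`; they correspond
    # to null cells in a heat map.
    return {key: fixes[key] / total[key] for key in total}

# Hypothetical commits, for illustration only.
sample = [
    ("C", "Database", True),
    ("C", "Database", False),
    ("C", "Library", False),
    ("Haskell", "Library", True),
]
for cell, ratio in sorted(defect_proneness(sample).items()):
    print(cell, round(ratio, 3))
```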
In order to answer this question we first filtered out projects
that would have been viewed as outliers, that is, flagged as high
leverage points, in our regression models. This was necessary
here because, even though this is a nonstatistical method, some
relationships could distort the visualization. For example, we
[...] project, was responsible for all of the errors in Middleware. This
[...] for Middleware. This pattern repeats in other domains;
consequently, we filter out the projects that have defect density
below the 10th and above the 90th percentile. The result is in Figure 1.
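A minimal sketch of such a percentile filter is given below. It assumes that each project has a single defect density value, that percentiles are computed by linear interpolation between order statistics, and that projects exactly at the cutoffs are kept; the text does not specify these details, and the names are hypothetical.

```python
def percentile(values, p):
    """p-th percentile (0 <= p <= 100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    pos = p * (len(ordered) - 1) / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

def filter_by_density(density_by_project, low=10, high=90):
    """Keep projects whose defect density lies between the two percentile cutoffs."""
    densities = list(density_by_project.values())
    lo_cut = percentile(densities, low)
    hi_cut = percentile(densities, high)
    return {
        name: density
        for name, density in density_by_project.items()
        if lo_cut <= density <= hi_cut
    }

# Hypothetical densities, for illustration only: projects p1..p11 with values 1..11.
toy = {f"p{i}": i for i in range(1, 12)}
print(sorted(filter_by_density(toy).values()))
```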
We see only a subdued variation in this heat map which
is a result of the inherent defect proneness of the languages
as seen in RQ1. To validate this, we measure the pairwise
rank correlation between the language defect proneness
for each domain with the overall. For all of the domains
except Database, the correlation is positive, and p-values are
significant (<0.01). Thus, w.r.t. defect proneness, the language ordering in each domain is strongly correlated with
the overall language ordering.
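The rank correlation used here can be illustrated with a small sketch of Spearman's coefficient, computed as the Pearson correlation of average ranks. The sketch does not compute p-values, assumes neither input is constant, and uses made-up numbers; all names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical defect proneness per language: one domain versus overall.
domain_values = [0.21, 0.35, 0.18, 0.40, 0.27]
overall_values = [0.25, 0.33, 0.20, 0.38, 0.24]
print(round(spearman(domain_values, overall_values), 3))
```

Comparing each domain column with the Overall column in this way tests only whether the language ordering is preserved, not whether the defect proneness values themselves agree.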
Figure 1. Interaction of language’s defect proneness with domain.
Each cell in the heat map represents defect proneness of a language
(row header) for a given domain (column header). The “Overall”
column represents defect proneness of a language over all the
domains. Cells with a white cross mark indicate a null value, that is,
no commits correspond to that cell.
Heat map columns: Application, CodeAnalyzer, Database, Framework,
Library, Middleware, Overall.

                  APP    CA     DB     FW     LIB    MW
Spearman corr.    0.71   0.56   0.30   0.76   0.90   0.46
p-Value           0.00   0.02   0.28   0.00   0.00   0.09
Result 3: There is no general relationship between application domain and language defect proneness.
We have shown that different languages induce a larger
number of defects and that this relationship is not only
related to particular languages but holds for general classes
of languages; however, we find that the type of project does
not mediate this relationship to a large degree. We now turn
our attention to categorization of the response. We want to
understand how language relates to specific kinds of defects
and how this relationship compares to the more general
relationship that we observe. We divide the defects into
categories as described in Table 5 and ask the following question:
We use an approach similar to RQ3 to understand the rela-
tion between languages and bug categories. First, we study
the relation between bug categories and language class.
A heat map (Figure 2) shows aggregated defects over language
classes and bug types. To understand the interaction between
Figure 2. Relation between bug categories and language class.
Each cell represents the percentage of bug fix commits out of all bug fix
commits per language class (row header) per bug category (column
header). The values are normalized column-wise.
[Figure 2 heat map. Row labels (language classes):
func-dynamic-implic.-managed, func-static-implic.-managed,
proc-static-implic.-managed, proc-static-expl.-unmanaged,
script-dynamic-expl.-managed, script-dynamic-implic.-managed.
Column labels (bug categories) recovered: Concurrency, Failure, Memory.]
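The caption of Figure 2 states that the values are normalized column-wise. One possible reading, sketched below, rescales each bug category column so that its cells sum to 100. This interpretation, the function name, and the counts are assumptions made for illustration, not the authors' procedure.

```python
def normalize_columns(counts):
    """Rescale each column of a nested dict so that its cells sum to 100.

    `counts` maps language class -> {bug category -> bug fix commit count}
    (a hypothetical input format). Cells in an all-zero column become None.
    """
    column_totals = {}
    for row in counts.values():
        for category, value in row.items():
            column_totals[category] = column_totals.get(category, 0) + value
    return {
        language_class: {
            category: (100.0 * value / column_totals[category]
                       if column_totals[category] else None)
            for category, value in row.items()
        }
        for language_class, row in counts.items()
    }

# Hypothetical counts, for illustration only.
toy_counts = {
    "proc-static-expl.-unmanaged": {"Memory": 30, "Concurrency": 10},
    "func-static-implic.-managed": {"Memory": 10, "Concurrency": 10},
}
for language_class, row in normalize_columns(toy_counts).items():
    print(language_class, row)
```

Under this reading, a cell shows how a bug category is distributed across language classes, so values are comparable within a column but not across columns.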