The results, averaged over 10 runs, are shown in Figure 3a
and b. The figures show how the negative log-likelihood of the data varies with time for the four inference algorithms. We see that
MC-LWMC has the lowest negative log-likelihood of all the algorithms by a large margin. It dominates MC-WMC within
about two minutes of runtime and is substantially superior to both lifted
BP and MC-SAT (notice the log scale). This shows the advantages
of approximate PTP over both lifted BP and ground inference.
Probabilistic theorem proving (PTP) combines theorem
proving and probabilistic inference. This paper proposed
an algorithm for PTP based on reducing it to lifted weighted
model counting, and showed both theoretically and empirically that it has significant advantages compared to previous
lifted probabilistic inference algorithms. An implementation of PTP is available in the Alchemy 2.0 system.
Directions for future research include: extension of PTP to
infinite, non-Herbrand first-order logic; new lifted inference
rules; theoretical analysis of liftability; porting more
speedup techniques from logical and probabilistic inference to PTP;
lifted splitting heuristics; better handling of existentials;
variational PTP algorithms; better importance distributions;
approximate lifting; answering multiple queries simultaneously; applications; etc.
Figure 2. (a) Impact of increasing the amount of evidence on the
time complexity of FOVE and PTP in the link prediction domain.
The number of objects in the domain is 100. (b) Impact of increasing
the number of objects on the time complexity of FOVE and PTP in the
link prediction domain, with 20% of the atoms set as evidence.
[Plot data omitted. Figure 2 x-axes: percentage of evidence objects in (a); number of objects in (b).]
Figure 3. Negative log-likelihood of the data as a function of time
for lifted BP, MC-SAT, MC-WMC, and MC-LWMC on (a) the entity
resolution (Cora) and (b) the collective classification domains.
with 20% of the atoms set as observed. We see that FOVE is
unable to solve any problem once the number of objects
grows beyond 100 because it runs out of memory. PTP,
on the other hand, solves all problems in less than 100 s.
7.2. Approximate inference
In this subsection, we compare the performance of MC-LWMC, MC-WMC, lifted belief propagation,24 and MC-SAT19
on two domains: entity resolution (Cora) and collective classification. The Cora dataset contains 1295 citations to 132
different research papers. The inference task here is to detect
duplicate citations, authors, titles, and venues. The collective
classification dataset consists of about 3000 query atoms.
Since computing the exact posterior marginals is infeasible in these domains, we used the following evaluation
method. We partitioned the data into two equal-sized sets: an
evidence set and a test set. We then computed the probability
of each ground atom in the test set given all atoms in the evidence set using the four inference algorithms. We measured
the error as the negative log-likelihood of the test data under
each algorithm's estimates (the negative log-likelihood is a
sampling approximation of the K–L divergence to the data-generating distribution, shifted by its entropy).
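This evaluation can be sketched as follows; `avg_neg_log_likelihood` and its inputs are hypothetical names for illustration, not the authors' implementation. Given an algorithm's estimated marginals for the held-out atoms, the score averages −log of the probability the algorithm assigns to each atom's observed truth value; since E_P[−log Q] = KL(P‖Q) + H(P), and H(P) is fixed by the data, a lower score means a smaller K–L divergence to the data-generating distribution.

```python
import math

def avg_neg_log_likelihood(marginals, truths):
    """Average negative log-likelihood of held-out ground atoms.

    marginals: estimated P(atom = True | evidence) from an inference
               algorithm (hypothetical input, one value per test atom).
    truths:    observed truth values of the same test atoms.
    """
    total = 0.0
    for p, t in zip(marginals, truths):
        # Clamp to avoid log(0) when a sampler assigns zero mass.
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        total -= math.log(p if t else 1.0 - p)
    return total / len(truths)

# A sharper, correct predictor scores lower (better) than a vague one:
print(avg_neg_log_likelihood([0.9, 0.1], [True, False]))  # ≈ 0.105
print(avg_neg_log_likelihood([0.5, 0.5], [True, False]))  # ≈ 0.693
```

For MCMC-based algorithms such as MC-LWMC and MC-WMC, the marginals fed to such a score would themselves be sample averages, so the curves in Figure 3 trace how this error falls as sampling time increases.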