In about a week, our paper Unifying Local Consistency and MAX SAT Relaxations for Scalable Inference with Rounding Guarantees by Steve, me, and Lise will appear at AISTATS in San Diego. (Steve will be giving a talk on it Monday morning.)
The paper title is a mouthful, and it includes a pretty technical result, so here’s a weak attempt at explaining the intuition and take-away messages.
The paper is about MAP inference, or finding the most likely state in a probability distribution. Specifically, it’s about MAP inference in what we are calling logical Markov random fields (MRFs), which are MRFs whose potentials are defined by weighted logical clauses of a particular form. We show equivalences between two different approaches (and a third bonus one) for approximating MAP inference in these logical MRFs. These equivalences are theoretically interesting, but at least as importantly, they allow us to get the benefits of each approach, leading to fast MAP approximations that have constant-factor approximation guarantees.
What’s special about logical MRFs is that MAP inference is equivalent to finding the maximum weighted satisfying assignment to the logical variables, aka the MAX SAT problem. On the surface, this equivalence doesn’t seem exciting, because it equates one NP-hard problem to another. What makes it exciting is that there are some nice approximation algorithms for MAX SAT with quality guarantees, and these algorithms can therefore be applied to get quality guarantees for MAP. Unfortunately, the approximation method by the famous computer science duo, Michael Goemans and David Williamson, requires solving a linear program (LP) that scales not-so-well in practice, when using off-the-shelf linear programming algorithms.
Another approach for approximating MAP inference borrows from recent developments on doing fast inference in MRFs using local-consistency relaxations. At a very high level, these approaches relax the space of possible marginal probabilities (i.e., the marginal polytope) to a simpler space that only requires local consistency between marginal probabilities of variables and factors (i.e., the local marginal polytope). By solving the corresponding optimization over this simpler, relaxed set of constraints, many very fast message-passing algorithms have been discovered in the past few years. So it’s natural to try to use one of these local-consistency relaxation message-passing algorithms to do MAP inference in a logical MRF.
The main result we show in the paper is that these two seemingly different approaches are equivalent. This equivalence means that when we use these fast-in-practice local-consistency relaxation algorithms that pass messages to quickly find an approximate solution, we’re also able to get the solution to the linear-program subproblem of the MAX SAT approximation algorithm. Using the LP solution, we can perform a special rounding scheme that guarantees a constant-factor approximation.
The last bonus equivalence is that both of these approaches are also equivalent to the linear form of a hinge-loss Markov random field, which we’ve been studying over the past few years as a powerful class of probabilistic models with efficiently optimizable energy functions. The conversion from logic to hinge-loss MRFs (i.e., the principles behind probabilistic soft logic) had previously been motivated by connections to fuzzy logic, and now we have these other relationships to MAX SAT relaxation and local-consistency relaxation.
Behind the scenes, this last bonus piece is how we happened to find these equivalences. We initially were working with these hinge-loss MRFs and we had thought that these other approaches to doing inference in logically-defined MRFs seemed so different, that it’d be interesting to compare them. So we ran a few tests and discovered they were behaving similarly… very similarly; they were returning solutions with differences that were small enough to be numerical errors. This surprising behavior led to confusion, derivations, and finally understanding.
Finally, the super weird part: any (nondeterministic factored) discrete MRF can be converted to an equivalent logical MRF. This, in some sense, means the constant-factor quality guarantees that come from the equivalence between logical MRF inference and MAX SAT, also apply to any discrete MRF. But along the way of converting a discrete MRF to the restricted logical MRF form, the strength of this constant-factor guarantee must be weakened, especially since we know that MAP can’t be approximated to a constant factor unless P = NP. But it’s still an open question how this all fits together. Check out the paper, and maybe we can brainstorm about this in San Diego.