A popular approach to pixel labeling problems, such as multiclass image segmentation, is to construct a pairwise conditional Markov random field (CRF) over image pixels, where the pairwise term encodes a preference for smoothness within local 4-connected or 8-connected pixel neighborhoods. Recently, researchers have considered higher-order models that encode soft non-local constraints (e.g., label consistency, connectedness, or co-occurrence statistics). These new models and the associated energy minimization algorithms have significantly advanced the state of the art for pixel labeling problems. In this paper, we consider a new non-local constraint that penalizes inconsistent pixel labels between disjoint image regions having similar appearance. We encode this constraint as a truncated higher-order matching potential function between pairs of image regions in a conditional Markov random field model and show how to perform efficient approximate MAP inference in the model. We experimentally demonstrate quantitative and qualitative improvements over a strong baseline pairwise conditional Markov random field model on two challenging multiclass pixel labeling datasets.
6 Figures and Tables
Figure 1. Schematic showing the mapping between two regions P and Q. We wish to penalize labelings in which the two regions disagree on corresponding pixel assignments, yp and yq.
Table 1. Pixelwise semantic labeling accuracy for the 21-class MSRC and 8-class Stanford Background datasets. The table compares the baseline unary and pairwise CRF models against models with non-truncated and truncated higher-order matching potentials.
Figure 2. Truncated higher-order matching potential. The potential penalizes disagreement between labels, yp and yq, of corresponding pixels within matched regions, P and Q, up to some maximum penalty, Mmax.
Figure 4. Illustration of weights ((d) and (g)) assigned to pixels within matched regions ((b) and (c), and (e) and (f), respectively). Panels (d) and (g) are colored with red indicating a higher weight.
Figure 5. Plot showing percentage agreement between corresponding pixels in matched regions with respect to (i) ground-truth labels, (ii) output from our model with higher-order matching potentials, (iii) baseline CRF output, and (iv) unary model output.
Figure 7. Best viewed in color. Example results from our multiclass pixel labeling experiments on the 8-class Stanford Background dataset. See Figure 6 for description of panels.
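The captions for Figures 2, 4, and 5 describe a penalty on label disagreement between corresponding pixels that is capped at Mmax, optional per-pixel weights, and a percentage-agreement measure. The following is only a minimal sketch of those ideas, not the paper's implementation. It assumes the penalty is a weighted count of disagreeing pixel pairs, clipped at Mmax, and that the pixel correspondence between P and Q is already given as two aligned label lists. The function names `truncated_matching_penalty` and `percent_agreement` and all parameter names are hypothetical.

```python
from typing import Optional, Sequence


def truncated_matching_penalty(
    labels_p: Sequence[int],
    labels_q: Sequence[int],
    m_max: float,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted label disagreement between matched regions, capped at m_max.

    Assumption: labels_p[i] and labels_q[i] are the labels of corresponding
    pixels in regions P and Q; the exact functional form in the paper may differ.
    """
    if len(labels_p) != len(labels_q):
        raise ValueError("regions must have the same number of matched pixels")
    if weights is None:
        # Assumed default: every pixel pair contributes equally.
        weights = [1.0] * len(labels_p)
    if len(weights) != len(labels_p):
        raise ValueError("need exactly one weight per matched pixel pair")

    # Sum the weights of the pixel pairs whose labels disagree.
    disagreement = sum(
        w for yp, yq, w in zip(labels_p, labels_q, weights) if yp != yq
    )
    # Truncation: the penalty never exceeds m_max.
    return min(disagreement, m_max)


def percent_agreement(labels_p: Sequence[int], labels_q: Sequence[int]) -> float:
    """Percentage of corresponding pixels that carry the same label."""
    if len(labels_p) != len(labels_q) or len(labels_p) == 0:
        raise ValueError("regions must be non-empty and equally sized")
    matches = sum(1 for yp, yq in zip(labels_p, labels_q) if yp == yq)
    return 100.0 * matches / len(labels_p)
```

Under these assumptions, two regions that disagree on two of four corresponding pixels incur a penalty of 2.0 when Mmax is 5.0, but only 1.5 when Mmax is 1.5. The cap limits how much a single poor match can contribute to the total energy.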