Finding an Edge Cover

Next: Generating the Edit Script Up: Method Overview Previous: Pruning the Induced Graph

Finding an Edge Cover

By applying the pruning rules (Section 5.3) to the induced graph of our running example, say we obtain the pruned induced graph depicted in Figure 3 (ignore for the present the difference between dotted and solid lines in the figure). Although the pruned induced graph typically has far fewer edges than the original induced graph does, it may still contain more edges than needed to form an edge cover. In Section 4.2 we will see that we need only consider edge covers that are minimal; that is, edge covers that are not proper supersets of any edge cover. In other words, we would like to remove from the pruned induced graph those edges that are not needed to cover nodes. For example, in the pruned induced graph shown in Figure 3, having all four of the edges [7,61] , [7,63] , [9,61] , and [9, 63] is unnecessary; we may remove either [7, 63] and [9, 61] ; or [7, 61] and [9, 63] . However, it is not possible to decide a priori which of these options is the better one; that is, it is not obvious which choice would lead to an edit script of lower cost. With pruning, on the other hand, there was no doubt that certain edges could be removed.

One way to decide among these options is to enumerate all possible minimal edge covers of the pruned induced graph, find the edit script corresponding to each one (using the method described later in Section 5), and to pick the one with the least cost. However, given the exponentially large number of edge covers, this is obviously not an efficient algorithm. To compute an optimal edge cover efficiently, we need to be able to determine how much each edge in the edge cover contributes to the total cost of an edit script corresponding to an edge cover containing it. That is, we need to distribute the cost of the edit script corresponding to an edge cover over the individual edges of the edge cover. Once we have a cost defined for each edge in the pruned induced graph, we can find a minimum-cost edge cover using standard techniques based on reducing the edge cover problem to a weighted matching problem [&make_named_href('', "node21.html#combopt","[PS82]"), &make_named_href('', "node21.html#lawler","[Law76]")]. For example, if the edges [7,61] , [7,63] , [9,61] , and [9, 63] , have costs 0, 1.3, 0.2, and 2.4, respectively, then we generate an edge cover that includes [7,61] and [9,61] , and excludes [7,63] and [9,61] .

Note, however, that such a reduction of the edit script problem to an edge cover (and thus, weighted matching) problem cannot be exact, given the hardness of the edit script problem. Indeed, our method of assigning costs to edges of the induced graph (Section 5.1) is only approximate, and thus the minimum-cost edge cover is not guaranteed to produce the best solution for the edit script problem.

Next: Generating the Edit Script Up: Method Overview Previous: Pruning the Induced Graph

Sudarshan S. Chawathe
Sat Feb 22 12:28:02 PST 1997