In this Section, we present an overview of algorithm MH-DIFF for computing a minimum-cost edit script between two trees. We present our algorithm informally using a running example; the details are deferred to later sections.
Figure: The trees for the running example in
Section 3.
Consider the two trees depicted in Figure 2. We
would like to find a minimum-cost edit script that transforms tree
into tree
. The reader may observe that these trees are
isomorphic to the initial and final trees from
Example 2.1 in Section 2. Note,
however, that there is no correspondence between the node identifiers
of
and
in Figure 2. This is because in
Example 2.1 we applied a known edit script to a tree,
transforming it to another tree in the process, whereas in this
section, we are trying to find an edit script, given two trees with no
information on the relationship between their nodes. Therefore, our
first step consists of finding a correspondence between the nodes of the
two given trees.
For example, consider the node 8 in Figure 2. We
want to find the node in that corresponds to this node in
.
The dashed lines in Figure 2 represent some of the
possibilities. Intuitively, we can see that matching the node 8 to
the node 51 does not seem like a good idea, since not only do the
labels of the two nodes differ, but the two nodes also have very
different locations in their respective trees; node 8 is a leaf node,
while node 51 is the root node. Similarly, we may intuitively argue
that matching node 8 to node 62 seems promising, since they are both
leaf nodes and their labels match. However, note that matching a
nodes based simply on their labels ignores the structure of the trees,
and thus is not, in general, the best choice. We make this intuitive
notion of a correspondence between nodes more precise below.