Next: Cost Model
Up: Model and Problem Definition
Previous: Model and Problem Definition
In the following, we will assume that an edit operation e is applied
to , and produces the tree .
We write this as .
We consider the following six edit operations:
- Insertion: Intuitively, an insertion operation creates a new
tree node with a given label, and places it at a given position in
the tree. The position of the new node n in the tree is specified
by giving its parent node p and a subset C of the children of
p. The result of this operation is that n is a child of p,
and the nodes C, that were originally children of p, are now
children of the newly inserted node n.
Formally, an insertion operation is denoted by
,
where n is the (unique) identifier of the new node, v is the
label of the new node, is the node that is to be the
parent of n, and is the set of nodes that are
to be the children of n. When applied to ,
we get a tree , where ,
, , , , and . Due to space constraints, we describe the
remaining edit operations only informally below; the formal
definitions are in [&make_named_href('',
"node21.html#bbdiffExtended","[CGM97]")].
- Deletion: This operation is the inverse of the insertion
operation. Intuitively, causes n to disappear
from the tree; the children of n are now the children of the
(old) parent of n. The root of the tree cannot be deleted.
- Update: The operation changes the label of the
node n to v.
- Move: A move operation moves the subtree rooted
at n to another position in the tree. The new position is
specified by giving the new parent of the node, p. The root cannot
be moved.
- Copy: A copy operation copies the subtree rooted
at n to a another position. The new position is
specified by giving the node p that is to be the parent of the new
copy. The root cannot be copied.
- Glue: This operation is the inverse of a copy operation. Given
two nodes and such that the subtrees rooted at and
are isomorphic, causes the subtree rooted
at to disappear. (It is conceptually ``united'' with the
subtree rooted at .) The root cannot be glued. Although the
GLU operation may seem unusual, note that it is a natural choice
for an edit operation given the existence of the CPY operation. As
we will see in Example 2.1, inverting an edit
script containing a CPY operations results in an edit script with a
GLU operation. This symmetry in the structure of edit operations
is useful in the design of our algorithms.
In addition to the above tree edit operations, one may wish to
consider operations such as a subtree delete operation that
deletes all nodes in a given subtree. Similarly, one could define a
subtree merge operation that merges two or more subtrees. We
do not consider such more complex edit operations in this paper, but
note that some of these operations, (e.g., subtree deletes) may be
detected by post-processing the output of our algorithm.
We define an edit script to be a sequence of zero or more edit
operations that can be applied in the order in which they occur in the
sequence. That is, given a tree , a sequence of edit operations
is an edit script if there exist
trees such that . We say that the
edit script transforms to , and write .
Figure 1: Edit operations on labeled trees
Next: Cost Model
Up: Model and Problem Definition
Previous: Model and Problem Definition
Sudarshan S. Chawathe
Sat Feb 22 12:28:02 PST 1997