The align phase of the edit script algorithm presents an interesting
problem. Suppose we detect that for , the children of **x**
and **y** are misaligned. In general, there is more than one sequence
of moves that will align the children. For instance, in
Figure 7 there are at least two ways to align the
children of nodes 1 and 11. The first consists of moving nodes 2 and
4 to the right of node 6, and the second consists of moving nodes 3,
5, and 6 to the left of node 2. Both yield the same final
configuration, but the first one is better since it involves
fewer moves.

**Figure 7:** A matching with misaligned nodes

To ensure that the edit script generated by the algorithm is of
minimum cost, we must find the shortest sequence of moves to align the
children of **x** and **y**. Our algorithm for finding the shortest
sequence of moves is based on the notion of a *longest common
subsequence*, described next.

Given a sequence , a sequence is a *
subsequence* of **S** if it can be obtained by deleting zero or more
elements from **S**. That is, where . Given two sequences and
, a *longest common subsequence (LCS)* of and ,
denoted by , is a sequence of pairs of elements such that (1) is a
subsequence of ; (2) is a subsequence of ;
(3) for , is true for some
predefined equality function *equal*; and (4) there is no sequence
that satisfies conditions 1, 2, and 3 and is longer than **S**.
The length of an LCS of and is denoted by .

We use an algorithm due to Myers [Mye86] that computes an LCS of two sequences in time , where and . We treat Myers' LCS algorithm as having three inputs: the two sequences and to be compared, and an equality function used to compare and for equality. That is, we treat it as the procedure .

The solution to the alignment problem is now straightforward. Compute
an LCS **S** of the matched children of nodes **x** and **y**, using the
equality function *equal* that is true if and only if
. Leave the children of **x** that are in **S** fixed, and
move the remaining matched children of **x** to the correct positions
relative to the already aligned children. In
Figure 7, the LCS is **3,5,6** (matching the sequence
**12,13,14**). The moves generated are and .
In [CRGMW95], we show that our LCS-based strategy always leads
to the minimum number of moves.

Wed Jun 19 08:22:34 PDT 1996