next up previous
Next: References Up: Change Detection in Previous: Quality of FastMatch's

Summary and Future Work


We have motivated the problem of computing and representing changes in hierarchically structured data. Our formal definition of the change detection problem for hierarchically structured data uses the idea of a matching and a minimum cost edit script that transforms one tree to another. We have split the change detection problem into two subproblems: the Good Matching and the Minimum Conforming Edit Script problems. We have presented algorithms for these problems, and we have studied our algorithms both analytically and empirically. Finally, as an application of some of these ideas, we have implemented a program for computing and representing changes in structured documents. More details about the implementation, including a sample ``run,'' can be found in [CRGMW95].

We are working on generalizing our algorithms to detect changes in data that can be represented as graphs but not necessarily trees. We are also investigating other matching criteria to improve the performance of our algorithms, especially for non-document domains. We plan to further investigate the tradeoff between optimality and efficiency to produce a parameterized algorithm where the parameter k specifies the desired level of optimality. We are also improving the implementation of our LaDiff program, and extending it to HTML and SGML documents. We also plan to incorporate the diff program in a web browser so that users can monitor web pages of interest and track their changes over time.

= .9

Sudarshan S. Chawathe
Wed Jun 19 08:22:34 PDT 1996