SGML Documents and their Manipulation in Object and Object-Relational Databases

Erich Neuhold (joint work with Karl Aberer)
GMD-IPSI, Darmstadt

Over the past five years, GMD-IPSI has investigated the problem of managing SGML/HyTime-compliant documents and has developed prototypes for managing such documents within object-oriented and object-relational database management systems. In the course of this work, a number of problems have been addressed that relate both to generic data management issues for documents and to specific techniques required to support the SGML/HyTime standard.

With regard to data management, we investigated the object modelling of documents, the use of different degrees of document fragmentation at the physical storage level, and the use of indexing techniques and corresponding query optimization techniques. With regard to the SGML/HyTime standard, we looked at technical issues arising from incomplete markup and from inclusions/exclusions, and at the maintenance of document consistency, both with regard to the DTD definitions and with regard to the additional semantic constraints specified within the HyTime standard. In the light of today's developments in semi-structured data management and around the evolving XML family of standards, many of these results are attracting renewed interest.

Ref: Klemens Boehm, Karl Aberer, Erich J. Neuhold, Xiaoya Yang: Structured Document Storage and Refined Declarative and Navigational Access Mechanisms in HyperStorM. In: VLDB Journal, Vol. 6 (1997), Springer-Verlag.