Bridging the Gap between RDF and XML

"It is a goal to facilitate the use of RDF mechanisms to access the information contained in a broad range of XML documents, including those that were not initially structured according to the RDF 1.0 layering."

--- The Cambridge Communiqué [CC99]

Introduction

The convoluted syntax of the RDF 1.0 specification [RDF99] is a major obstacle to the broad acceptance of RDF. The goal of this proposal is to allow every "legacy" XML document to have an RDF model. The advantages of this approach include:
  1. The semantics of XML documents can be made explicit. Both structural and semantic markup can coexist in the same document.
  2. RDF can be used to annotate existing XML documents.
  3. "RDF-enabled" XML can still be rendered and transformed using XSLT [XSLT99].
  4. With small changes to their DTDs, existing XML documents can yield meaningful RDF documents. Even without a DTD, every XML document has a default RDF interpretation.
This proposal builds upon a simplified syntax for RDF discussed in [BL99] and [SM99]. It uses digest-based algorithms to generate stable identifiers for anonymous resources, that is, pieces of information that are not explicitly named.
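
The idea of a digest-based identifier can be illustrated with a minimal sketch. It is not the algorithm used by the parser, which is not specified here. The sketch assumes that an anonymous resource corresponds to an XML element, that a deterministic serialization of the element (tag, sorted attributes, whitespace-trimmed text, children) is an acceptable basis for the digest, and that SHA-1 is the digest function. The names canonical_form and anonymous_id and the "urn:sha1:" prefix are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def canonical_form(elem):
    """Serialize an element deterministically (assumed canonical form)."""
    parts = [elem.tag]
    # Sort attributes so that their order in the source does not matter.
    parts += [f"@{k}={v}" for k, v in sorted(elem.attrib.items())]
    # Trim surrounding whitespace so that indentation does not matter.
    parts.append((elem.text or "").strip())
    for child in elem:
        parts.append(canonical_form(child))
        parts.append((child.tail or "").strip())
    return "(" + "|".join(parts) + ")"


def anonymous_id(elem):
    """Derive an identifier from the element content (hypothetical scheme)."""
    digest = hashlib.sha1(canonical_form(elem).encode("utf-8")).hexdigest()
    return "urn:sha1:" + digest


a = ET.fromstring('<section caption="Introduction"><p>Hello</p></section>')
b = ET.fromstring('<section caption="Introduction">\n  <p>Hello</p>\n</section>')
print(anonymous_id(a) == anonymous_id(b))
```

Because the identifier depends only on the content, parsing the same document twice yields the same identifier for the same anonymous resource, which is what makes such identifiers stable.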

Example

XML documents typically contain both structural and semantic markup. A typical document of this kind is shown below:

<?xml version="1.0"?>

<document>

  <title>Bridging the Gap between RDF and XML</title>
  <author>Sergey Melnik</author>

  <abstract>A proposal to provide RDF interpretation for XML</abstract>

  <section caption="Introduction">

     <p>The goal of this proposal is to facilitate the use of RDF mechanisms
        to access the information contained in a broad range of XML documents.</p>

     <p>It builds upon a <a href="syntax.html">simplified syntax</a> for RDF,
        but has a <em>broader</em> scope.</p>

  </section>

</document>

The following figure illustrates a possible semantic interpretation for this XML document:

An RDF parser needs hints to determine whether a given XML element names a relationship or a class. Every XML tag is regarded as a relationship name unless the RDF property rdf:instance is used to override this default. For example, adding rdf:instance to the section tag in the example above makes each section an instance of the class "section". These hints can be stored in the DTD of the document without modifying the document content (see example).
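
The default rule, in which every tag names a relationship, can be sketched roughly as follows. This is an illustration under stated assumptions, not the mapping implemented by the parser: the document element is taken as the starting resource "_:doc", an element with no children and no attributes becomes a literal value, any other element becomes an anonymous resource labelled "_:n1", "_:n2" and so on, and attributes are treated as relationships of their element. Mixed content and the rdf:instance override are left out. The function name default_triples is hypothetical.

```python
import itertools
import xml.etree.ElementTree as ET


def default_triples(xml_text):
    """Return (subject, relationship, object) triples under the assumed default rule."""
    counter = itertools.count(1)
    triples = []

    def visit(elem, subject):
        # Assumption: attributes are relationships from the element's resource.
        for name, value in elem.attrib.items():
            triples.append((subject, name, value))
        for child in elem:
            if len(child) == 0 and not child.attrib:
                # Leaf element: the tag is a relationship to a literal value.
                triples.append((subject, child.tag, (child.text or "").strip()))
            else:
                # Structured element: the tag points to an anonymous resource.
                node = f"_:n{next(counter)}"
                triples.append((subject, child.tag, node))
                visit(child, node)

    visit(ET.fromstring(xml_text), "_:doc")
    return triples


doc = """<document>
  <title>Bridging the Gap between RDF and XML</title>
  <author>Sergey Melnik</author>
  <section caption="Introduction"/>
</document>"""

for triple in default_triples(doc):
    print(triple)
```

In this sketch the anonymous resources are simply numbered; a digest-based identifier could be substituted to keep them stable across parses.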

The RDF interpretation of a document is controlled by the RDF tags rdf:instance, rdf:for and rdf:resource (compare [SM99]).

Rather than discuss the details of the XML-to-RDF mapping here, we encourage the reader to download the parser together with an RDF API and to experiment with it.

Download

Download an experimental parser for the new RDF syntax. It is packaged together with an RDF API.

References

CC99 The Cambridge Communiqué, W3C Note, Oct 1999
http://www.w3.org/TR/1999/NOTE-schema-arch-19991007
SM99 Sergey Melnik. Simplified Syntax for RDF, Nov 1999
http://infolab.stanford.edu/~melnik/rdf/syntax.html
BL99 Tim Berners-Lee. A Strawman Unstriped Syntax for RDF in XML, 1999 
http://www.w3.org/DesignIssues/Syntax
RDF99 Resource Description Framework (RDF) Model and Syntax Specification, Feb 1999
http://www.w3.org/TR/REC-rdf-syntax/
XSLT99 XSL Transformations (XSLT), Nov 1999
http://www.w3.org/TR/xslt



Sergey Melnik, Dec 16, 1999. Last change: Dec 16, 1999