CS99I Meeting 08 Notes: XML

By Gio Wiederhold, Updated 28 Jan 2001.

Topics Covered briefly

B2B needs: automation

Earlier time: HTML: for document transmittal, varied presentation, hierarchically structured + links to other HTML, images, etc.; ordered
Tags provide metadata for presentation (see the HTML intro). Problem: the nice-for-people presentation doesn't really define what is being represented. For business use we want web pages that can be processed automatically.

To the rescue: XML: for document processing, hierarchically structured + links, more; ordered (except for attributes)

Read more in XML intro.

Whereas the HTML tags are common to all HTML documents, the XML tags are domain dependent. Domains might be:

For each domain, the allowable tags and the structure in which they appear have to be defined. That is done in a Document Type Definition (DTD). To indicate whether elements are optional or can be repeated, they are labeled with characters used in regular expressions.
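As a rough illustration of how such occurrence markers constrain structure, the sketch below checks the children of a catalog entry against a content model written as a regular expression over tag names (`?` for optional, `*` for zero or more). The content model, the helper name `children_match`, and the choice of which elements are optional are assumptions made for illustration only; in practice a validating XML parser would check a document against its DTD.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content model for a <Product> element, in DTD-like notation:
#   (Name, Quantity, Price, Weight?, Color*)
# Rewritten here as a regular expression over space-terminated child tag names.
PRODUCT_MODEL = re.compile(r"Name Quantity Price (Weight )?(Color )*")

def children_match(element, model):
    """Return True if the sequence of child tags fits the content model."""
    tags = "".join(child.tag + " " for child in element)
    return model.fullmatch(tags) is not None

good = ET.fromstring(
    "<Product><Name>Pencil</Name><Quantity>12</Quantity>"
    "<Price>1.50</Price><Color>yellow</Color></Product>"
)
bad = ET.fromstring(
    "<Product><Quantity>12</Quantity><Name>Pencil</Name></Product>"
)

print(children_match(good, PRODUCT_MODEL))  # True: Weight omitted, one Color
print(children_match(bad, PRODUCT_MODEL))   # False: wrong order, Price missing
```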
Regular expression syntax

Important for formulating

  1. Representation grammars
  2. Queries (getting some subset of the representation)

Notation:
    sequence: (a,b,c)
    alternatives: (x|y), in combination ((x|y),b,c) {x,b,c or y,b,c}
    optional: q? {q | nothing}
    any: r* {nothing | r | rr | rrr | ... }
    repeats: s+ { s | ss | sss | ... }
Example:
(((S|s)ection|paragraph(s?) )*.)
matches all citations looking like
Section xx., section xx., paragraph xx., paragraphs xx.
By setting a marker for xx, that text can be retrieved for display or processing. Regular expression syntax is powerful, but not really user-friendly.
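The citation example can be tried out with a small sketch in Python's `re` module. The pattern is a stricter restatement of the expression in these notes, not a copy of it: it assumes that xx is a run of digits and that each citation ends with a period, and the sample sentence is invented for the demonstration. The named group `xx` plays the role of the marker.

```python
import re

# Assumed pattern: "Section", "section", "paragraph" or "paragraphs",
# one space, the cited number (marked as group "xx"), and a closing period.
CITATION = re.compile(r"((S|s)ection|paragraphs?) (?P<xx>[0-9]+)\.")

# Invented sample text for the demonstration.
text = "See Section 12. and paragraphs 3. for details; section 7. is obsolete."

# Retrieve the marked part of every citation found in the text.
found = [m.group("xx") for m in CITATION.finditer(text)]
print(found)  # ['12', '3', '7']
```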

XSL

To view an XML file, it must be transformed, preferably into HTML. Examples are given in the XML description.
For instance, an XML catalog with entries such as
    <Product> <Name>Pencil </Name> <Quantity>12 </Quantity> <Price>1.50 </Price> <Weight>60 </Weight> <Color> yellow </Color> </Product>
    etc.

would be instructed through an XSL program to

  1. Put in a heading: "ITEM (boxed), Quantity/box, Price per box, ..."
  2. Start a new line whenever a <Product> tag appears
  3. Put a dollar sign in front of the price
  4. Put "gram" behind </Weight>;
      for a U.S. customer, divide the number by about 28.35 (grams per ounce) and put "oz." behind </Weight>
  5. Perhaps add a Java routine to compute a total at the end, when the customer clicks [done]
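The steps above can be sketched in Python as a stand-in for an XSL program; this is not XSL itself, only an illustration of the same transformation. The enclosing `<Catalog>` element, the function name `catalog_to_html`, the `us_customer` flag, the Weight and Color columns, and the reading of "total" as the sum of the box prices are all assumptions. The total is also computed directly rather than by a routine triggered on a click.

```python
import xml.etree.ElementTree as ET
from html import escape

GRAMS_PER_OUNCE = 28.35  # approximate conversion factor

# Assumed wrapper element <Catalog>; the notes show only the <Product> entry.
CATALOG = """
<Catalog>
  <Product> <Name>Pencil </Name> <Quantity>12 </Quantity> <Price>1.50 </Price>
            <Weight>60 </Weight> <Color> yellow </Color> </Product>
</Catalog>
"""

def catalog_to_html(xml_text, us_customer=False):
    """Render each <Product> as one HTML table row."""
    root = ET.fromstring(xml_text)
    unit = "oz." if us_customer else "gram"
    # Step 1: heading row.
    rows = ["<tr><th>ITEM (boxed)</th><th>Quantity/box</th>"
            "<th>Price per box</th><th>Weight</th><th>Color</th></tr>"]
    total = 0.0
    for product in root.iter("Product"):      # Step 2: one row per product
        def field(tag):
            return (product.findtext(tag) or "").strip()
        price = float(field("Price"))
        weight = float(field("Weight"))
        if us_customer:                       # Step 4: grams to ounces
            weight = weight / GRAMS_PER_OUNCE
        total += price
        rows.append(
            "<tr><td>{}</td><td>{}</td><td>${:.2f}</td>"   # Step 3: dollar sign
            "<td>{:.1f} {}</td><td>{}</td></tr>".format(
                escape(field("Name")), escape(field("Quantity")),
                price, weight, unit, escape(field("Color"))))
    # Step 5: total of the box prices (assumed meaning of "total").
    rows.append('<tr><td colspan="5">Total: ${:.2f}</td></tr>'.format(total))
    return "<table>\n" + "\n".join(rows) + "\n</table>"

print(catalog_to_html(CATALOG))
print(catalog_to_html(CATALOG, us_customer=True))
```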

References

Brief intro to XML.
Brief intro to RDS ADO [ASP, 25 Feb 2000].
XSL information
See also the references.