CS99I Meeting 2 Notes: HTML, XML, XSL

By Gio Wiederhold, 12 Jan 2000.

Topics Covered briefly

Representations

Advantages and Limits

Readability
Processability
Granularity
-- (structure: word, line, paragraph, chapter, book )
-- (object: value, name-value pair, item, person, group, community ) with alternatives (family vs dorm)

Convenience versus precision

Words: unique in context, ambiguous out of context
Context: explicit versus implicit (by sets of words:

  1. "miter .. bishop";
  2. "miter .. wood";
  3. "knave -- bishop")

Processability -- web usage -- retrieval.

Formats

Paper: arbitrarily structured/unstructured; physical order.
Books: somewhat structured/unstructured; layout order; metadata: ToC, index.
Tables: very structured. Exceptions awkward -- footnotes
Databases: very structured. Machine processable, queryable. Exceptions awkward.

    relational: tabular based, links by references, join operator; unordered. student|><|course-info
    object-oriented: tree-based, structural (and optional reference) links; ordered (often)
SGML: for document printing, hierarchically structured; ordered
HTML: for document transmittal, varied presentation, hierarchically structured + links; ordered
XML: for document processing, hierarchically structured + links, more; ordered (except for attributes)
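The relational entry above (links by references, join operator, unordered) can be sketched in a few lines of Python. This is only an illustration of student|><|course-info, not how a database implements it: the sample rows, the field names, and the helpers `natural_join` and `as_relation` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; the names and titles are made up for illustration.
STUDENTS = [
    {"student": "Ann", "course": "CS99I"},
    {"student": "Bob", "course": "CS145"},
    {"student": "Cy", "course": "CS99I"},
]
COURSE_INFO = [
    {"course": "CS99I", "title": "Seminar"},
    {"course": "CS145", "title": "Databases"},
]


def natural_join(left, right, key):
    """Pair every left row with every right row that shares the same key value."""
    by_key = defaultdict(list)
    for row in right:
        by_key[row[key]].append(row)
    return [{**l, **r} for l in left for r in by_key.get(l[key], [])]


def as_relation(rows):
    """View a list of rows as an unordered set, the way a relation is defined."""
    return {frozenset(row.items()) for row in rows}


if __name__ == "__main__":
    for row in natural_join(STUDENTS, COURSE_INFO, "course"):
        print(row)
```

The link between the two tables is the shared value of `course` (a reference), not a position in the file. Comparing results through `as_relation` shows that the order of the input rows does not change the joined relation.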

Regular expression syntax

Important for formulating

  1. Representation grammars
  2. Queries (getting some subset of the representation)

    sequence: (a,b,c)
    alternatives: (x|y), in combination (x|y, b,c) {x,b,c or y,b,c}
    optional: q$ {q | nothing}
    any: r* {nothing | r | rr | rrr | ... }
    repeats: s+ { s | ss | sss | ... }
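The operators listed above can be tried out with Python's `re` module. This is a sketch under two assumptions that are not in the notes: the comma of a sequence is written as plain concatenation in Python, and the optional marker `q$` corresponds to Python's `q?`. The table `OPERATORS` and the helper `matches` are hypothetical names.

```python
import re

# (operator as written in the notes, assumed Python pattern,
#  strings that should match, strings that should not)
OPERATORS = [
    ("sequence (a,b,c)", r"abc", ["abc"], ["ab", "acb"]),
    ("alternatives (x|y)", r"x|y", ["x", "y"], ["xy", ""]),
    ("combination (x|y, b,c)", r"(x|y)bc", ["xbc", "ybc"], ["bc", "xybc"]),
    ("optional q$", r"q?", ["q", ""], ["qq"]),
    ("any r*", r"r*", ["", "r", "rrr"], ["s"]),
    ("repeats s+", r"s+", ["s", "sss"], [""]),
]


def matches(pattern, text):
    """True when the whole string, not just a part of it, fits the pattern."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for name, pattern, accepted, rejected in OPERATORS:
        ok = all(matches(pattern, s) for s in accepted)
        ok = ok and not any(matches(pattern, s) for s in rejected)
        print(name, "->", pattern, "behaves as listed:", ok)
```

`re.fullmatch` is used so that each string is tested as a whole, which mirrors the sets in braces in the notes.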
Example:
(((S|s)ection|paragraph(s$) )*.)
matches all citations looking like
Section xx., section xx., paragraph xx., paragraphs xx.
By setting a marker for xx, that text can be retrieved for display or processing. A regular language is capable, but not really user-friendly.
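The citation example can be sketched in Python's `re` syntax. This is an approximation and not the notation used above: it assumes that xx is a single run of letters or digits, writes the optional plural as `s?`, and uses a capturing group as the "marker" for xx. The names `CITATION` and `cited_items` are hypothetical.

```python
import re

# "Section xx.", "section xx.", "paragraph xx." or "paragraphs xx."
# The parenthesized group (\w+) is the marker that captures xx.
CITATION = re.compile(r"(?:[Ss]ection|paragraphs?) (\w+)\.")


def cited_items(text):
    """Return every xx that appears in a citation, in order of appearance."""
    return CITATION.findall(text)


if __name__ == "__main__":
    sample = "See Section 12. and paragraphs 7. but not chapter 3."
    print(cited_items(sample))  # ['12', '7']
```

The pattern is compact but hard to read at a glance, which illustrates the point about user-friendliness.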

Programming

Base Programming: Machines as interpreters
Scripts: Software as interpreters
Combinations: microcode, Bytecode
CGI, Java, Javabeans, etc.

  1. The program - either in machine language or in an intermediate language such as Java bytecode.
  2. The data - in one of the representations discussed above (.., databases, XML, ...)

Role of Standards

Standards are a tool in competition. They can be set:
  1. By a governmental agency: prescriptive standards.
  2. By historical convenience: width of two Roman horses --> width of carts and wheels --> grooves in stones --> other carts --> mining carts --> rails for mining carts --> standard-gauge railroads.
  3. By a company that dominates the industry -- IBM in the 1960s and 1970s (cards, tapes, disks, SQL), Microsoft in the 1990s (Windows, Word, ...).
  4. By an industry consortium that tries to counteract a dominant company -- POSIX for UNIX.
  5. By an industry alliance that tries to create a market -- OMG for object-oriented software.
  6. By a dominant customer -- DoD for the Ada language.
  7. By a customer-supplier collaboration -- Wintel: Microsoft and Intel.

News about Java Standards

New York Times: January 26, 2000

Microsoft Is Told to Abide by Sun on Java
By THE ASSOCIATED PRESS
SEATTLE, Jan. 25 -- A federal judge has ruled that Microsoft must conform to standards set by Sun Microsystems when it sells products that use Sun's Java programming language.
The judge, Ronald Whyte of the United States District Court, amended a preliminary injunction that said Microsoft would be in violation of Sun's copyright on the Java language as well as in violation of its contract with Sun if it shipped products that failed to conform to Sun's standards.
An appeals court had overturned the earlier order because of the copyright element. Judge Whyte dropped that part of the ruling in his amended order.
The ruling came in a lawsuit that Sun filed in October 1997, accusing Microsoft of trying to extend the Java language for special use with its Windows operating system, which Sun contends is a violation of both the contract and the Java copyright.
Michael Morris, general counsel for Sun, said the company was happy with the decision by the judge.
Jim Cullinan of Microsoft said the amended ruling showed that Microsoft did not harm competition through its actions but merely that a contract dispute existed over the companies' licensing agreement.
Microsoft is still barred from shipping its versions of Java, and other issues in the suit remain unresolved.
Java, introduced by Sun in 1995, allows developers to write a software application that can run on a variety of computers, regardless of the underlying system. Sun has tried to promote Java as a universal programming language.

Notes

See
Brief intro to HTML.
Brief intro to XML.
Brief intro to RDS ADO [ASP, 25 Feb 2000].
XSL information
See also the references.