Project meeting at Stanford

Date: 02/25/99
Attendees: Svetlozar Nestorov, Sebastien Brion and Yue Zhuge

The three of us studied NoDoSE, the Northwestern Document Structure Extractor (available from http://shrike.cs.nwu.edu/nodose ), developed by Stanford alumnus Brad Adelberg and his students. The objective of this system is to extract structure from semi-structured text using an interactive tool. Svetlozar showed us trial version 1.0 of the system, using both the example files provided and some of our own files. We found that this version correctly parses only very simple, regular files, for example, files with records separated by blank lines. However, the interactive approach is an interesting way to deal with semi-structured data that contains inconsistencies. (P.S. I talked with Brad later and he promised to send us a later and better version of NoDoSE.)

Before the meeting, Sebastien and Yue tried displaying XML documents in IE 5.0 Beta. Here are a few things we found:

* IE 5.0 Beta can display well-formed XML documents. If a document is not well formed, the browser returns an error and tries to point out where the error occurred.
* IE 5.0 Beta can display XML documents with DTDs, and can check whether a document conforms to its DTD.
* Without a style sheet, a document is displayed "as is": everything, including declarations, tags and angle brackets, is shown. Text is displayed in bold.
* Simple style sheet examples work fine. For example, one can change the fonts or font color using a style sheet.

Yue also tried the XML editor from Microsoft, XML Notepad. This is a very simple editor; one can use it to write well-formed XML documents by providing tags and element values. It returns an error if a document it tries to open is not well formed, so it cannot be used to convert HTML documents that are not well formed into XML.
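
To make the well-formedness point concrete, here is a minimal sketch of the kind of check that IE 5.0 Beta and XML Notepad both appear to perform when opening a document. It is not taken from either tool: it uses Python's standard xml.etree.ElementTree parser, and the function name check_well_formed and the two sample strings are our own illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return (True, None) if text parses as XML,
    otherwise (False, (line, column)) for the first error found."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple reported by the parser
        return False, err.position
    return True, None


# Hypothetical sample inputs, made up for illustration.
good = "<note><to>Yue</to><body>Meeting notes</body></note>"
bad = "<p>Line one<br>Line two</p>"  # HTML-style <br> is never closed

print(check_well_formed(good))
print(check_well_formed(bad))
```

The second sample shows why ordinary HTML often fails: an unclosed tag such as <br> is accepted by HTML browsers but is an error for an XML parser, so the document is rejected before any conversion can start.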