Started by Gio Wiederhold, 16 January 2000, updated Jan 23 2002.
Why Binary numbers?
What is the benefit of binary number representation? Reliability
through simplicity.
Counting with only two symbols {0, 1}.
What about 10 {0, .. , 9}?
What about 3 {-1,0,+1}?
Other base systems: 10, 3,
...
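The same count can be written in any of these bases. A minimal sketch, assuming two helper functions named here only for illustration (`to_base` and `to_balanced_ternary` are not part of the notes):

```python
def to_base(n, base):
    """Digits of a non-negative integer n in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, r = divmod(n, base)   # peel off the lowest digit
        digits.append(r)
    return digits[::-1]


def to_balanced_ternary(n):
    """Digits of an integer n using the symbols {-1, 0, +1}, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        n, r = divmod(n, 3)
        if r == 2:               # write 2 as 3 - 1: digit -1, carry 1 to the next position
            r = -1
            n += 1
        digits.append(r)
    return digits[::-1]


print(to_base(13, 2))            # [1, 1, 0, 1]
print(to_base(13, 10))           # [1, 3]
print(to_balanced_ternary(5))    # [1, -1, -1], that is 9 - 3 - 1
```

Fewer symbols mean longer numbers but simpler, more easily distinguished states, which is one way to read "reliability through simplicity".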
Character sets: ASCII 7 bits = 128
characters, of which 33 are for control (codes 0-31, which include null, plus delete). Derived from teletypes. Leaves 95 printable characters.
In practice 8 bits = 256 choices, ASCII plus whatever
someone wants.
For more extensive languages there is Unicode: 16 bits = 64K choices (1K is 1024 -- why?)
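The counts above follow from powers of two. A small sketch that checks them, assuming the usual split of the 7-bit codes into control codes (0 through 31, plus 127) and printable codes (32 through 126):

```python
codes = range(2 ** 7)                                  # the 128 seven-bit codes
control = [c for c in codes if c < 32 or c == 127]     # control codes, including null (0) and delete (127)
printable = [c for c in codes if 32 <= c < 127]        # space through tilde

print(len(control), len(printable))    # 33 95
print(2 ** 8, 2 ** 16)                 # 256 65536
print(2 ** 10, 2 ** 16 // 2 ** 10)     # 1024 64, so 16 bits give 64K choices

# One character and its 7-bit pattern
print(ord("A"), format(ord("A"), "07b"))   # 65 1000001
```

1K is 1024 because it is the power of two (2 to the 10th) closest to one thousand.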
Build characters to words or numbers; words or numbers to records or sentences; records or sentences to messages; messages to papers or books; papers or books to knowledge?
Bit image: encodings of 2D graphics as height x width pixels -- each pixel has 3 intensities, one per color (RGB), or another triple such as (Luminosity, .. ). A variety of standards: GIF, BMP, JPEG (why?)
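A rough size estimate shows how quickly raw pixel data grows, which is one likely reason for the variety of compressed formats. This is a sketch only; the function name `raw_image_bytes` and the 640 x 480 example size are assumptions chosen for illustration:

```python
def raw_image_bytes(width, height, channels=3, bits_per_channel=8):
    """Bytes needed to store an uncompressed image, one intensity per channel per pixel."""
    return width * height * channels * bits_per_channel // 8


size = raw_image_bytes(640, 480)
print(size)            # 921600 bytes
print(size // 1024)    # 900, that is 900K for a single uncompressed picture

# A tiny 2 x 2 image as rows of (R, G, B) pixels
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
print(len(tiny), len(tiny[0]), len(tiny[0][0]))   # 2 2 3
```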
Specify layout, type of print, bold, italic, size, headers, paragraph boundaries, tables, etc.
with otherwise invisible commands such as <B>boldface stuff</B>.
Hyper (multi-linked) Text (documents) Markup (with format annotations) Language. Used to mark up documents so they can be easily shown on a variety of computer devices, and to reference (HREF) local and remote documents and images. Remote documents require a computer address (http://www.somewhere.xxx) so they can be found.
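To see how a program separates the visible text from the markups, here is a minimal sketch using Python's standard `html.parser` module. The class name `MarkupLister` and the one-line sample page are assumptions made up for this illustration:

```python
from html.parser import HTMLParser


class MarkupLister(HTMLParser):
    """Collects the visible text and the HREF targets of a marked-up document."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names arrive lower-cased
        for name, value in attrs:
            if tag == "a" and name == "href":
                self.links.append(value)

    def handle_data(self, data):
        # Everything between the markups is what the reader actually sees
        self.text.append(data)


page = 'Some <B>boldface stuff</B> and a <A HREF="http://www.somewhere.xxx">remote document</A>.'
lister = MarkupLister()
lister.feed(page)
lister.close()                 # flush any text still held in the parser's buffer

visible = "".join(lister.text)
print(visible)                 # Some boldface stuff and a remote document.
print(lister.links)            # ['http://www.somewhere.xxx']
```

A browser does the same separation, then uses the markups to decide how to present the text on the device at hand.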
Paper: arbitrarily structured/unstructured; physical order.
Books: somewhat structured/unstructured; layout order; metadata: ToC, index.
Tables: very structured. Exceptions awkward -- footnotes
Databases: very structured. Machine processable, queryable. Exceptions awkward.
relational: table-based, links by references, join operator; unordered. Example: student |><| course-info
object-oriented: tree-based, structural (and optional reference) links; ordered (often)
SGML: for document printing, hierarchically structured; ordered
HTML: for document transmittal, varied presentation, hierarchically structured + links; ordered
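The relational join (student |><| course-info) links two tables through a shared reference value. A minimal sketch, assuming hypothetical rows and a helper named `join`; the student names and course titles are invented for illustration:

```python
def join(left, right, key):
    """Natural join of two tables (lists of row dicts) on one shared column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)   # look up right-hand rows by key value
    result = []
    for l_row in left:
        for r_row in index.get(l_row[key], []):
            result.append({**l_row, **r_row})        # one combined row per matching pair
    return result


# Hypothetical tables
student = [
    {"name": "Ann", "course": "CS99I"},
    {"name": "Bob", "course": "CS101"},
    {"name": "Cy", "course": "CS99I"},
]
course_info = [
    {"course": "CS99I", "title": "Seminar"},
    {"course": "CS101", "title": "Introduction"},
]

joined = join(student, course_info, "course")
for row in joined:
    print(row["name"], row["course"], row["title"])
```

The tables are unordered in the sense that shuffling the rows of either input yields the same set of joined rows; only the listing order changes.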
Three older inventions combined:
Two Technologies:
and a business requisite
A community of high-energy physicists who
Browser competition [Clark-Netscape] [Gates-Microsoft]
Reading: Bring in a simple HTML web document (like this one), and see what it looks like
If you look at a `commercial' web page you will find many
markups that we won't have to care about. Make notes about the ones that puzzle
you and discuss them in class. The essential ones are listed in our CS99I HTML notes.
Doing, indirectly: Create a document with, say, Microsoft Word, save it
as HTML, and look at it.
Doing, directly: Create a document with HTML markups yourself, as shown
in the notes, and then save it as text. It may be easiest to use a plain-text editor, such as
WordPad or Notepad on PCs, or vi or Emacs on Unix.
Change (rename) the file extension from .txt to .html, and then look at what you have created.
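The same steps can be carried out by a short program instead of an editor. This is only a sketch: the page content, the file name `first.txt`, and the use of a temporary folder are assumptions for illustration:

```python
import tempfile
from pathlib import Path

# A very small page with a few markups
page = """<HTML>
<HEAD><TITLE>My first page</TITLE></HEAD>
<BODY>
<H1>Hello</H1>
<P>Some <B>boldface stuff</B> and <I>italic stuff</I>.</P>
</BODY>
</HTML>
"""

folder = Path(tempfile.mkdtemp())          # scratch folder, so nothing existing is overwritten
draft = folder / "first.txt"
draft.write_text(page, encoding="ascii")   # step 1: save the markup as plain text

final = draft.with_suffix(".html")         # step 2: same name, .html instead of .txt
draft.rename(final)

print(final)   # open this file in a browser to see the result
```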
Advantages and Limits
Reliability
Readability
Processability
Granularity
-- (structure: word, line, paragraph, chapter, book )
-- (object: value, name-value pair, item, person, group, community ) with
alternatives (family vs dorm)
See also the references.