We describe only a few basic commands of the HyperText Markup Language (HTML). The current common version is HTML 2.0, but 3.0 is often available. In a browser you can inspect or save the source file to learn about the formatting that was used. Not all browsers handle all formats, and they certainly don't treat them the same way.
To serve its clientele, Microsoft has introduced many commands that can be understood only by its Internet Explorer. They allow a more faithful representation of its Word and PowerPoint files when saved as HTML. See the section OTHER below.
HTML is an application conforming to ISO 8879 (Standard Generalized Markup
Language or SGML). SGML uses embedded directives to indicate formatting,
while leaving the interpretation to the client's display program and its
knowledge about the screen, paper, user preferences, etc. These directives
are bracketed by Less-Than (<) and Greater-Than (>) symbols.
In this document we use UPPER CASE for all HTML directives shown, although
lower- and upper-case directives are equivalent.
Browsers may ignore directives in these <brackets> that they don't recognize.
To let us show the directives in this HTML document, we use some special
symbols internally (see below).
There are also special characters, which start with an ampersand (&).
Each document should start with a declaration
<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">,
here indicating that the document conforms to HTML version 2.0
(but ignored by most browsers), followed by
<HTML>.
Most commands have a corresponding closure; for instance, there should
be a
</HTML> at the end of the document.
A document is split into a HEAD and a BODY.
The HEAD is for external information, such as the TITLE, which the
browser uses for its frame and as the external name of the page,
e.g.,
<HEAD><TITLE>HTML information for CS99I book</TITLE>
That can be followed by a reference to the web page's own location, useful when trying
to find out where one has ended up on the web:
<BASE HREF="http://www-db.stanford.edu/pub/gio/CS99I/html-info.html">
</HEAD>
and a BODY, i.e.,
<BODY> followed by everything in the document, until the closing </BODY>,
except for <! declarations not to be displayed >
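As a minimal sketch of the structure just described, the skeleton below puts the declaration, the HEAD, and the BODY together. The title and body text are placeholders chosen for illustration, and the public identifier in the declaration is the one given in the HTML 2.0 specification (RFC 1866), which differs slightly from the one quoted above.

```html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
<HEAD>
<TITLE>Placeholder title shown in the browser frame</TITLE>
</HEAD>
<BODY>
<!-- Everything to be displayed goes between BODY and /BODY -->
<P>Placeholder body text.</P>
</BODY>
</HTML>
```

Saving this as a file and opening it in a browser should show the title in the window frame and the single paragraph in the page area.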
There are six levels of section headers:
<Hx>heading text</Hx> x = 1..6
We use <H1> for the chapter headings,
<H2> for the major sections,
and <H3> for subsections.
<P> starts a paragraph, to be terminated with </P>,
and
<BR> forces a linebreak (used liberally in this document).
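The fragment below is a small sketch of how the heading, paragraph, and linebreak directives might be combined. All heading and paragraph texts are invented for illustration.

```html
<H1>Chapter heading</H1>
<H2>Major section</H2>
<P>First paragraph of the section.</P>
<H3>Subsection</H3>
<P>A paragraph with a forced<BR>
linebreak in the middle.</P>
```

How large each heading level appears, and how much space surrounds a paragraph, is left to the browser.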
Lists are of three types:
<yL> list: <UL> unnumbered; <OL> numbered; <DL> definition
Each entry in a <UL> or <OL> list starts with <LI>
(a <DL> list uses <DT> and <DD> for its terms and definitions),
and the list is terminated by </yL>.
List commands such as <OL>, </OL> are also (mis)used to provide indenting
of text.
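One short example of each list type is sketched below. The entry texts are placeholders. Note that the definition list marks its entries with DT (term) and DD (definition), the standard HTML directives for that list type, rather than LI.

```html
<UL>
<LI>first unnumbered entry
<LI>second unnumbered entry
</UL>

<OL>
<LI>first numbered entry
<LI>second numbered entry
</OL>

<DL>
<DT>term
<DD>definition of the term
</DL>
```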
Normally you want to leave as much formatting as possible to the browser,
since it will adjust itself to the available page size and user preferences,
but formatting can be disabled by bracketing
<PRE> preformatted as is </PRE>.
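A case where preformatted text can help is a small column layout that depends on the spacing typed in the source. The sketch below uses invented data; the browser should keep the spaces and line ends as entered and typically shows the text in a fixed-width font.

```html
<PRE>
name      qty
apples      3
pears      12
</PRE>
```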
The ability to go to other documents is the main innovation of HTML.
<A HREF="filename"> mousearea </A>, as in
<A HREF="http://db.stanford.edu/pub/gio/CS99I/intro.html">CS99I Introductory
Chapter</A>
This also works to go to files that are in other formats, if your browser
has the appropriate plugin, say Ghostscript for
<A HREF="http://db.stanford.edu/pub/gio/slides/atarpa.ps">ARPA postscript
slides</A>.
One can also use a hyperlink to go into the middle of a document, if a
name has been given to the desired entrypoint:
<A HREF="#SecSix">Section 6</A> --> <A NAME="SecSix">
(Note: the NAME= definition appears not to work inside TABLEs.)
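The two kinds of hyperlinks can be sketched side by side as follows. The URL is a placeholder (example.com is reserved for illustrations), and SecSix is simply the entry-point name used in the text above.

```html
<!-- Link to another document; the URL is a placeholder -->
<A HREF="http://www.example.com/intro.html">Introductory chapter</A>

<!-- Link to a named entry point further down in the same document -->
<A HREF="#SecSix">Section 6</A>

<!-- The entry point itself, placed where Section 6 begins -->
<H2><A NAME="SecSix">Section 6</A></H2>
```

The name after the # in the HREF must match the NAME of the entry point exactly.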
If you want to go back to How to Write for the Web, click
.
Which formats can be handled depends on the browser's plugins
(optional software).
To put a 3-pixel border around a picture of a worm, we can add
<IMG SRC="../gifs/nematode.jpg" BORDER="3">.
Use
<A HREF="mailto:gio@cs.stanford.edu">email to: gio@cs.stanford.edu</A>
to insert a mailing address. The text between the opening <A..> and
the closing </A> is arbitrary.
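A bordered image and a mail link might be written as in the sketch below. The file name and the mail address are placeholders, and the ALT attribute (text for browsers that cannot show the image) is an addition not discussed above.

```html
<!-- An inline image with a 3-pixel border; the file name is a placeholder -->
<IMG SRC="picture.gif" ALT="description of the picture" BORDER="3">

<!-- A mail link; the address is a placeholder -->
<A HREF="mailto:someone@example.com">Send mail to the author</A>
```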
<BLOCKQUOTE> for quotations</BLOCKQUOTE>
<ADDRESS> for addresses </ADDRESS>
<CENTER> text </CENTER>
More characters are denoted numerically as &#nnn;, where nnn is the sum of
the row and column numbers in the table below.
Note that any characters your browser does not understand come out funny or as entered.
I hope none crash your browser.
All 256 1-byte characters
+ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | | | | | | | | | | | | | | | | | | | | | | | | |
20 | | | | | | | | | | | | | | | | ! | " | # | $ | % | & | ' | | |
40 | ( | ) | * | + | , | - | . | / | 0 | 1 | | | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | | |
60 | < | = | > | ? | @ | A | B | C | D | E | | | F | G | H | I | J | K | L | M | N | O | | |
80 | P | Q | R | S | T | U | V | W | X | Y | | | Z | [ | \ | ] | ^ | _ | ` | a | b | c | | |
100 | d | e | f | g | h | i | j | k | l | m | | | n | o | p | q | r | s | t | u | v | w | | |
120 | x | y | z | { | | | } | ~ | | | | | | | | | | | | | | | | | |
140 | | | | | | | | | | | | | | | | | | | | | | | | |
160 | | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | | | ª | « | ¬ | | ® | ¯ | ° | ± | ² | ³ | | |
180 | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | | | ¾ | ¿ | À | Á | Â | Ã | Ä | Å | Æ | Ç | | |
200 | È | É | Ê | Ë | Ì | Í | Î | Ï | Ð | Ñ | | | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | | |
220 | Ü | Ý | Þ | ß | à | á | â | ã | ä | å | | | æ | ç | è | é | ê | ë | ì | í | î | ï | | |
240 | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | | | ú | û | ü | ý | þ | ÿ | | |
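To make the row-plus-column rule concrete, the sketch below writes two characters from the table as numeric references, which take a number sign between the ampersand and the number. It also shows the named references for the three characters that HTML itself uses for markup. The surrounding sentences are invented for illustration.

```html
<!-- Numeric references: row number plus column number from the table -->
<P>&#65; is A (row 60 + column 5)</P>
<P>&#201; is E with an acute accent (row 200 + column 1)</P>

<!-- Named references for the characters that HTML itself uses -->
<P>&lt;P&gt; starts a paragraph, and &amp; starts a special character.</P>
```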
Styles, relative sizes, and colors can be indicated,
but your browser chooses the actual representation.
<FONT> takes options that apply until the closing </FONT>:
SIZE changes the size, say <FONT SIZE=+1> to increase it by 1;
FACE changes the type of font, as in FACE="ARIAL";
and COLOR sets the color, as in <FONT COLOR=BLUE>.
<EM> Emphasis italics </EM>; we use these for words cited in
the glossary.
<STRONG> Strong emphasis, usually bold </STRONG>
<CITE> book, journal citation italics </CITE>
<KBD> typing font </KBD>; we use these for examples of type-ins.
<VAR> substitution example font </VAR>
<B> bold </B>
<I> italic </I>
<TT> typewriter </TT>
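The sketch below uses several of these style directives in running text, each with its proper closing directive. The sentences, the book title, and the typed command are invented for illustration, and the attribute values are quoted as a precaution.

```html
<P>See the <EM>glossary</EM> for terms, and type <KBD>ls -l</KBD> to list files.</P>
<P><STRONG>Warning:</STRONG> described in <CITE>A Placeholder Book Title</CITE>.</P>
<P><B>bold</B>, <I>italic</I>, and <TT>typewriter</TT> request a specific look.</P>
<P><FONT SIZE="+1" COLOR="blue">larger blue text</FONT></P>
```

EM, STRONG, CITE, and KBD describe the role of the text and leave the look to the browser, whereas B, I, TT, and FONT ask for a particular appearance.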
We just show a summary example.
<TABLE> <TABLE BORDER=3> <TABLE CELLSPACING=2 (standard)>
<CAPTION> one line only, centered, plain, last line wins</CAPTION>
<TR><TH>a row of centered (default) header items <TH> more <TH> for
as many columns as wanted
<TH WIDTH=pixels or WIDTH=percent%>, CENTER is the default.
<TR><TD>a row of data fields <TD> more data
<TD> field with left-aligned data (default)
<TR> more rows, joint field width automatic, multi line automatic<TD>
<TD>
<TR>more rows
<TD or TH options include
By default the alignment of tables is done automatically; an example without SPACING, WIDTH, ALIGN, or SPAN options is seen above in the table of characters.
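A complete small table, written with explicit closing directives, might look like the sketch below. The caption and cell contents are invented; the BORDER and CELLSPACING values are the ones mentioned in the summary.

```html
<TABLE BORDER=3 CELLSPACING=2>
<CAPTION>Placeholder caption</CAPTION>
<TR><TH>Item</TH><TH>Quantity</TH></TR>
<TR><TD>apples</TD><TD>3</TD></TR>
<TR><TD>pears</TD><TD>12</TD></TR>
</TABLE>
```

Many browsers accept rows and cells without the closing /TR, /TH, and /TD, as in the summary, but writing them out makes the nesting easier to check.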
The paragraph bracket allows setting a variety of styles, but their interpretation can vary by browser. Useful are:
<P STYLE='MARGIN-LEFT:0.5in'> to provide a half inch indent, used here
<P STYLE='TAB-STOPS:2.5in'> ?? how to use ?? ;
<TAB ID=tabname> and <TAB TO=tabname> seem to be proposed only;
multiple style entries can be placed within the quotes and separated
with a semicolon (;).
text-of-length-for-tab
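As a sketch of several style entries in one STYLE attribute, the paragraph below combines the half inch indent with a text color. The color entry and the paragraph text are additions for illustration, and browsers without style support will simply ignore the attribute.

```html
<P STYLE="margin-left:0.5in; color:blue">
This paragraph is indented by half an inch and, in browsers that
support the property, shown in blue.
</P>
```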
Tables can be very long. For instance, a list
of all 84 Hitchcock movies has been manually split into 4 distinct tables.
Long tables take long to load, and are hard to manage with scrollbars.
We can use an option, DATAPAGESIZE, in the TABLE specification
to split the presentation of a long table.
There are two levels of comments;
Hits since 15 Jan 2000:
More information about such a counter can be found at the Hitometer counter's home page.
To check HTML files for correctness, you may want to use an HTML checker. One was made by an independent company (Web Site Garage), which was bought by Netscape in late 1998. It is now (Jan 2002) at Web Site Garage of Netscape.
<meta> introduces meta commands, meant for search engines, and hidden
from the user. They are often misused to cause high rankings, as
<meta Money Money Money Money > to give the impression that this
web page is financially worthwhile.
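For contrast with the misuse above, a conventional use of META places NAME and CONTENT pairs in the HEAD, as sketched below. The title, description, and keywords are placeholders, and how much weight a given search engine gives these entries is not something this sketch can promise.

```html
<HEAD>
<TITLE>Placeholder title</TITLE>
<META NAME="description" CONTENT="A short summary of what the page is about">
<META NAME="keywords" CONTENT="HTML, markup, tutorial">
</HEAD>
```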
Current versions of Microsoft Word and PowerPoint have the option to convert their documents to HTML, and vice versa. But since the capabilities of HTML browsers don't match the capabilities of MS Word, the result is often imperfect. Some subsequent manual editing can make the result look much better.
<Style> introduces the use of style templates to improve the look,
but makes HTML much less general.
Specific Microsoft styles within the style section are bracketed by
Items that require XML processing are introduced with
See also the CS99I references.