michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: HTML
michael@0:
michael@0: This document describes the complete handling of HTML in magellan. The
michael@0: document covers the parsing process - how HTML is lexically analyzed
michael@0: and then interpreted. After the parsing process is discussed we give a detailed
michael@0: analysis of each HTML tag and the attributes that are supported, the values
michael@0: for the attributes and how the tag is treated by magellan.
michael@0:
michael@0: Parsing
michael@0: HTML is tokenized by an HTML scanner. The scanner is fed Unicode data to
michael@0: parse. Stream converters are used to translate from various encodings to
michael@0: Unicode. The scanner separates the input stream into tokens which consist
michael@0: of:
michael@0:
michael@0: -
michael@0: text
michael@0:
michael@0: -
michael@0: tags
michael@0:
michael@0: -
michael@0: entities
michael@0:
michael@0: -
michael@0: script-entities
michael@0:
michael@0: -
michael@0: comments
michael@0:
michael@0: -
michael@0: conditional comments
michael@0:
michael@0: The HTML parsing engine uses the HTML scanner for lexical analysis. The
michael@0: parsing engine operates by attacking the input stream in a set of
michael@0: well-defined steps:
michael@0:
michael@0: -
michael@0: The parser processes the head portion of the document first, without emitting
michael@0: any output. This is done to discover a few special features of html:
michael@0:
michael@0:
michael@0: -
michael@0: The parser processes META tags looking for META TARGET
michael@0:
michael@0: -
michael@0: The parser processes META tags looking for META tags which affect the character
michael@0: set. Nav4 handles the very first character set defining meta tag (all others
michael@0: are ignored) by reloading the document with the proper character conversion
michael@0: module inserted into the stream pipeline.
michael@0:
michael@0:
michael@0: -
michael@0: After the head portion is processed the parser then proceeds to process
michael@0: the body of the document
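
The head pre-scan for a character set can be illustrated with a minimal sketch. This is not the magellan or Nav4 code: the function name and the regular expressions are assumptions made for illustration, and only the "first charset-defining META wins" rule comes from the text above.

```python
import re

# Hypothetical helper (not part of magellan): scan the head for META tags
# and report the character set named by the first one that defines it.
META_RE = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)
CHARSET_RE = re.compile(r"charset\s*=\s*([\w.:-]+)", re.IGNORECASE)


def first_meta_charset(head_html):
    """Return the charset from the first charset-defining META tag, or None."""
    for match in META_RE.finditer(head_html):
        found = CHARSET_RE.search(match.group(0))
        if found:
            # Later charset-defining META tags are ignored.
            return found.group(1).lower()
    return None
```

A caller would use the returned name to pick a stream converter and then reload the document, which is the step described for Nav4 above.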
michael@0:
michael@0:
michael@0:
michael@0: Tag Processing
michael@0: Tags are processed by the parser by locating a "tag handler" for
michael@0: the tag. The HTML parser serves as the tag handler for all of the builtin
michael@0: tags documented below. Tag attribute handling is done during translation
michael@0: of tags into content. This mapping translates the tag attributes into content
michael@0: data and into style data. The translation to style data is documented below
michael@0: by indicating the mapping from tag attributes to their CSS1 (plus extensions)
michael@0: equivalents.
michael@0:
michael@0: Special Hacks
michael@0: The following list describes hacks added to the magellan parsing engine
michael@0: to deal with navigator compatibility. These are just the parser hacks,
michael@0: not the layout or presentation hacks. Most hacks are introduced for HTML
michael@0: syntax error recovery. HTML doesn't say much about how to handle those
michael@0: error conditions. Netscape has made a big effort to render pages with imperfect
michael@0: HTML. For many reasons, new browsers need to stay compatible in this area.
michael@0:
michael@0: -
michael@0: Entities can be used as escapes in a quoted string. For the value string in a
michael@0: name-value pair, see compatibility test
michael@0: quote001.html. Test line 70 shows that an entity quote at the beginning
michael@0: means the value is NOT quoted. Test line 90 shows that if the value
michael@0: starts with a quote, then an entity quote does NOT terminate the value
michael@0: string.
michael@0:
michael@0: -
michael@0: Wrapping tags are special tags such as title, textarea, server, script,
michael@0: style, etc. The comment in ns\lib\libparse\pa_parse.c says:
michael@0:
michael@0:
/*
 * These tags are special in that, after opening one of
 * them, all other tags are ignored until the matching
 * closing tag.
 */
michael@0:
While searching for the end tag, comments and quoted strings are
michael@0: observed. See compatibility test title01.html.
michael@0: 6.0 handles comments now; quoted strings still need to be added.
michael@0: -
michael@0: If a <tr> or <td> tag is seen outside any <table> scope, it is
michael@0: ignored. See compatibility test table110.htm.
michael@0:
michael@0: -
michael@0: In the case of a table inside a table but not inside a cell, table
michael@0: tags before the last table tag are ignored. We found this problem in some
michael@0: Netscape public pages, see bug #85118. For example, for <table> <table
michael@0: border> ....., or <table> <tr> <table border>..., the table
michael@0: will be displayed with a border. See compatibility
michael@0: test table201.html. The table and tr tags are buffered for this recovery.
michael@0: When a TD or CAPTION tag is opened, the buffer is flushed out, because we
michael@0: cannot buffer the contents of TD or CAPTION due to performance and memory constraints.
michael@0: They are subdocs and can be very big. If we see a <table> outside a cell
michael@0: after the previous table is flushed out, the new <table> tag is ignored.
michael@0: Nav4.0 can discard the previous table in such a case. tableall.html
michael@0: is the index for table test cases.
michael@0:
michael@0: -
michael@0: Caption is not a commonly used feature. In Nav4.0, captions can be anywhere.
michael@0: For Captions outside cells, the first one takes effect. For captions inside
michael@0: cells, the last one takes effect, and they also close TD and TR. In 6.0,
michael@0: caption is limited to the standard position: after <table>. Captions
michael@0: in other places are ignored, their contents are treated as text. See test
michael@0: case table05a.html to table05o.html.
michael@0:
michael@0: -
michael@0: For <table> <tr> <tr>, the first <tr>
michael@0: takes effect.
michael@0:
michael@0: -
michael@0: The nav4 parser notices when it hits EOF and it's in the middle of scanning
michael@0: in a comment. When this happens, the parser goes back and looks for an
michael@0: improperly closed comment (e.g. a simple > instead of a -->). If it finds
michael@0: one, it reparses the input after closing out the comment.
michael@0:
michael@0: -
michael@0: XXX Brendan also pointed out that there is something
michael@0: similar done for tags, but I don't recall what it is right now.
michael@0:
michael@0: -
michael@0: When Nav4.0 sees the '<' sign, it searches for
michael@0: '>', observing quoted values. If it cannot find one before EOF, the '<'
michael@0: sign is treated as text. In Xena 6.0, a limit is set for how far the '>'
michael@0: is searched. The default limit is 4096 characters, and there is an API, HTMLScanner.setMaxTagLength(),
michael@0: to change it. Setting -1 means no limit, which is the same as Nav4.0.
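
The bounded search for '>' could look roughly like the sketch below. It is an illustration under stated assumptions, not the HTMLScanner implementation: the function name is hypothetical, the limit is assumed to be counted from the '<' itself, and both single and double quotes are assumed to delimit quoted values.

```python
def find_tag_end(text, start, max_tag_length=4096):
    """Return the index of the '>' that closes the tag opened at text[start].

    Returns -1 if no closing '>' is found within max_tag_length characters
    of the '<', in which case the caller would treat the '<' as plain text.
    A max_tag_length of -1 means "no limit" (search until end of input).
    """
    if max_tag_length < 0:
        limit = len(text)
    else:
        limit = min(len(text), start + max_tag_length)
    quote = None  # the quote character we are currently inside, if any
    for i in range(start + 1, limit):
        ch = text[i]
        if quote is not None:
            if ch == quote:
                quote = None  # closing quote: '>' counts again
        elif ch in "\"'":
            quote = ch  # opening quote: ignore '>' until it closes
        elif ch == ">":
            return i
    return -1
```

With this reading, `find_tag_end('<a href="x>y">z', 0)` skips the '>' inside the quoted value and stops at the one that really ends the tag.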
michael@0:
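The end-of-file comment recovery described in the list above can be sketched as follows. The helper name is hypothetical, and choosing the first plain '>' after the comment opener as the recovery point is an assumption; the text only says that the parser looks for an improperly closed comment and reparses after it.

```python
def split_comment(text, start):
    """Split text at the comment that opens at text[start:start + 4] == '<!--'.

    Returns (comment, rest). If the comment is never closed with '-->',
    fall back to the first plain '>' (assumed recovery point) so that the
    remaining input can be reparsed as ordinary HTML.
    """
    body_start = start + 4
    end = text.find("-->", body_start)
    if end != -1:
        return text[start:end + 3], text[end + 3:]
    # Hit EOF while still inside the comment: look for an improper close.
    fallback = text.find(">", body_start)
    if fallback != -1:
        return text[start:fallback + 1], text[fallback + 1:]
    # Nothing to recover with: the whole remainder is comment.
    return text[start:], ""
```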
michael@0: TODO:
michael@0: Document the mapping of tag attributes into CSS1
michael@0: style, including any new "css1" attributes
michael@0:
michael@0: List of 6.0 features incompatible with 4.0
michael@0:
michael@0: -
michael@0: Navigator 4.0 value string is truncated at 82 characters. XENA60 limit
michael@0: is MAX_STRING_LENGTH = 2000.
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: Tags (Categorically sorted)
michael@0: All line breaks are conditional. If the x coordinate is at the current
michael@0: left margin then a soft line break does nothing. Hard line breaks are ignored
michael@0: if the last tag did a hard line break.
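
The conditional line break rules can be sketched with a small state object. The class and its fields are hypothetical names, not magellan's; the sketch also assumes that only intervening text resets the "last tag did a hard break" flag, which the text above does not spell out.

```python
class LineState:
    """Tracks just enough layout state to decide whether a break happens."""

    def __init__(self, left_margin=0):
        self.left_margin = left_margin
        self.x = left_margin              # current x coordinate
        self.last_was_hard_break = False  # assumption: reset only by text
        self.breaks = []                  # record of breaks actually emitted

    def add_text(self, width):
        self.x += width
        self.last_was_hard_break = False

    def soft_break(self):
        # A soft break does nothing when already at the left margin.
        if self.x == self.left_margin:
            return False
        self._emit("soft")
        return True

    def hard_break(self):
        # A hard break is ignored if the last tag already did one.
        if self.last_was_hard_break:
            return False
        self._emit("hard")
        self.last_was_hard_break = True
        return True

    def _emit(self, kind):
        self.breaks.append(kind)
        self.x = self.left_margin
```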
michael@0:
michael@0: divalign = left | right | center | justify
michael@0:
alignparam = abscenter | left | right | texttop | absbottom
michael@0: | baseline | center | bottom | top | middle | absmiddle
michael@0:
colorspec = named-color | #xyz | #xxyyzz | #xxxyyyzzz | #xxxxyyyyzzzz
michael@0:
clip = [auto | value-or-pct-xy](1..4) (pct of width for even
michael@0: coordinates; pct of height for odd coordinates)
michael@0:
value-or-pct = an integer with an optional %; if the percent
michael@0: is present any following characters are ignored!
michael@0:
coord-list = XXX
michael@0:
whitespace-strip = remove leading and
michael@0: trailing and any embedded whitespace that is not an actual space (e.g.
michael@0: newlines)
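
As an illustration of the value-or-pct rule, here is a minimal parsing sketch. The function name and return shape are assumptions; the sketch also ignores trailing characters when no percent sign is present, which the definition above does not state either way.

```python
import re

# Leading integer, optionally followed directly by '%'.
VALUE_OR_PCT_RE = re.compile(r"\s*([+-]?\d+)(%?)")


def parse_value_or_pct(raw):
    """Return (number, is_percent), or None if raw has no leading integer.

    Anything after the optional '%' is ignored, as the definition requires.
    """
    match = VALUE_OR_PCT_RE.match(raw)
    if match is None:
        return None
    return int(match.group(1)), match.group(2) == "%"
```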
michael@0:
michael@0: Head objects:
michael@0: TITLE
michael@0: The TITLE tag is a container tag whose contents are not HTML. The contents
michael@0: are pure text and are processed by the parser until the closing tag is
michael@0: found. There are no attributes on the tag and any whitespace present in
michael@0: the tag is compressed down with leading and trailing whitespace eliminated.
michael@0: The first TITLE tag found by the parser is used as the document's title
michael@0: (subsequent tags are ignored).
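
A short sketch of the TITLE handling described above, with hypothetical function names: whitespace runs are compressed to a single space, leading and trailing whitespace is dropped, and only the first TITLE counts.

```python
def normalize_title(raw):
    """Compress whitespace runs and trim both ends of the title text."""
    return " ".join(raw.split())


def document_title(title_contents):
    """Pick the document title from the TITLE contents in document order."""
    # Only the first TITLE tag is used; later ones are ignored.
    if not title_contents:
        return None
    return normalize_title(title_contents[0])
```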
michael@0: BASE
michael@0: Sets the base element in the head portion of the document. Defines
michael@0: the base URL for all? links in the document.
michael@0:
Attributes:
michael@0: HREF=url [This is an absolute URL]
michael@0:
TARGET=string [must start with XP_ALPHA|XP_DIGIT|underscore
michael@0: otherwise nav4 ignores it]
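
The TARGET check can be sketched as below. The helper name is hypothetical, and XP_ALPHA and XP_DIGIT are assumed here to mean ASCII letters and digits.

```python
import string

# Assumption: XP_ALPHA / XP_DIGIT correspond to ASCII letters and digits.
_TARGET_START = set(string.ascii_letters + string.digits + "_")


def is_valid_target(target):
    """True if the TARGET value starts with a letter, digit, or underscore."""
    return bool(target) and target[0] in _TARGET_START
```

Under this reading `_top` and `main` are accepted, while a value such as `-frame` would be ignored.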
michael@0:
michael@0: META
michael@0: Can define several header fields (content-encoding, author, etc.)
michael@0:
Attributes:
michael@0: REL=SMALL_BOOKMARK_ICON|LARGE_BOOKMARK_ICON
michael@0:
michael@0: HTTP-EQUIV="header: value"
michael@0:
michael@0:
michael@0: HTTP-EQUIV values (from libnet/mkutils.c NET_ParseMimeHeader):
michael@0: ACCEPT-RANGES
michael@0:
CONTENT-DISPOSITION
michael@0:
CONTENT-ENCODING
michael@0:
CONTENT-RANGE
michael@0:
CONTENT-TYPE [ defines character set only ]
michael@0:
CONNECTION
michael@0:
DATE
michael@0:
EXPIRES
michael@0:
EXT-CACHE
michael@0:
LOCATION
michael@0:
LAST-MODIFIED
michael@0:
LINK
michael@0:
PROXY-AUTHENTICATE
michael@0:
PROXY-CONNECTION
michael@0:
PRAGMA
michael@0:
RANGE
michael@0:
REFRESH
michael@0:
SET-COOKIE
michael@0:
SERVER
michael@0:
WWW-AUTHENTICATE
michael@0:
WWW-PROTECTION-TEMPLATE
michael@0:
WINDOW-TARGET
michael@0: Style sheets and HTML w3c spec adds this:
michael@0: CONTENT-STYLE-TYPE [ last one wins; overrides header from server if
michael@0: any ]
michael@0:
michael@0: LINK
michael@0: List related resources. Used by extensions mechanism to find tag handlers.
michael@0: /LINK == LINK!
michael@0:
Attributes:
michael@0: REL=FONTDEF
michael@0:
michael@0: REL=STYLESHEET [ If MEDIA param is defined it must ==nc screen ]
michael@0: LANGUAGE=LiveScript|Mocha|JavaScript1.1|JavaScript1.2
michael@0:
TYPE="text/javascript" | "text/css"
michael@0:
HREF=url
michael@0:
ARCHIVE=url
michael@0:
CODEBASE=url
michael@0:
ID=string
michael@0:
SRC=url
michael@0:
michael@0: Note: HREF takes precedence over SRC in nav4.
michael@0: HEAD
michael@0: /HEAD clears the "in_head" flag (but leaves the "in_body" flag alone).
michael@0:
Text in head clears in_head, and sets in_body true, just as if the author
michael@0: forgot the /HEAD tag.
michael@0:
Attributes: none
michael@0: HTML
michael@0: Ignored.
michael@0:
Attributes: none
michael@0: STYLE
michael@0: Allowed anywhere in the document. Note that entities are not parsed
michael@0: in the style tag's content.
michael@0:
Attributes:
michael@0: LANGUAGE=LiveScript|Mocha|JavaScript1.1|JavaScript1.2
michael@0:
TYPE="text/javascript" | "text/css"
michael@0:
HREF=url
michael@0:
ARCHIVE=url
michael@0:
CODEBASE=url
michael@0:
ID=string
michael@0:
SRC=url
michael@0:
michael@0: FRAMESET
michael@0: Frameset with rows=1 and cols=1 is ignored.
michael@0:
Attributes:
michael@0: FRAMEBORDER= no | 0 (zero) [default is no_edges=false]
michael@0:
BORDER= int [clamped: >= 0 && <= 100]
michael@0:
BORDERCOLOR= color
michael@0:
ROWS= pct-list
michael@0:
COLS= pct-list
michael@0:
michael@0: FRAME
michael@0: Border width of zero disables edges.
michael@0:
Attributes:
michael@0: FRAMEBORDER= no | 0 (zero) [default is framesets value]
michael@0:
BORDER= int [clamped; >= 0 && <= 100]
michael@0:
BORDERCOLOR= color
michael@0:
NORESIZE= true [default is false]
michael@0:
SCROLLING= yes | scroll | on | no | noscroll | off
michael@0:
SRC= url [clamped: prevent recursion by eliminating any ancestor
michael@0: references]
michael@0:
NAME= string
michael@0:
MARGINWIDTH= int (clamped: >= 1)
michael@0:
MARGINHEIGHT= int (clamped: >= 1)
michael@0:
michael@0: NOFRAMES
michael@0: Used when frames are disabled or for backrev browsers. Has no stylistic
michael@0: consequences.
michael@0:
michael@0:
michael@0:
michael@0:
Body objects:
michael@0: BODY
michael@0: The tag is only processed on open tags and it is always processed.
michael@0: See ns\lib\layout\laytags.c, searching for "case P_BODY". During tag processing
michael@0: the in_head flag is set to false and the in_body flag is set to true. An
michael@0: attribute is ignored if the document already has that attribute set. Attributes
michael@0: can be set by style sheets, or by previous BODY tags. See test
michael@0: head02.html.
michael@0:
Attributes:
michael@0: MARGINWIDTH=int [clamped: >= 0 && < (windowWidth/2
michael@0: - 1)]
michael@0:
MARGINHEIGHT=int [clamped: >= 0 && < (windowHeight/2
michael@0: - 1)]
michael@0:
BACKGROUND=url
michael@0:
BGCOLOR=colorspec
michael@0:
TEXT=colorspec
michael@0:
LINK=colorspec
michael@0:
VLINK=colorspec
michael@0:
ALINK=colorspec
michael@0:
ONLOAD, ONUNLOAD, ONFOCUS, ONBLUR, ONHELP=script
michael@0:
ID=string
michael@0:
michael@0: LAYER, ILAYER
michael@0: An open LAYER/ILAYER tag automatically closes out an open form if one is
michael@0: open. It does something to the soft linebreak state too.
michael@0:
Attributes:
michael@0: LEFT=value-or-pct (pct of right-left margin)
michael@0:
PAGEX=x (if no LEFT)
michael@0:
TOP=value-or-pct
michael@0:
PAGEY=y (if no TOP)
michael@0:
CLIP=clip
michael@0:
WIDTH=value-or-pct (pct of right-left margin)
michael@0:
HEIGHT=value-or-pct
michael@0:
OVERFLOW=string
michael@0:
NAME=string
michael@0:
ID=string
michael@0:
ABOVE=string
michael@0:
BELOW=string
michael@0:
ZINDEX=int [any value]
michael@0:
VISIBILITY=string
michael@0:
BGCOLOR=colorspec
michael@0:
BACKGROUND=url
michael@0:
michael@0: NOLAYER
michael@0: Container for content which is used when layers are disabled or unsupported.
michael@0: The content has no style consequences (though it could if somebody stuck
michael@0: in some CSS1 style rules for it).
michael@0: P
michael@0: Closes the paragraph. If the attribute is present then an alignment
michael@0: gets pushed on the alignment stack. All values are supported by nav4.
michael@0:
Attributes:
michael@0:
michael@0:
michael@0: ADDRESS
michael@0: There are no attributes. ADDRESS closes out the open paragraph. The
michael@0: open tag does a conditional soft line break and then pushes a merge of
michael@0: the current style with italics enabled onto the style stack. The close tag
michael@0: always pops the style stack and also does a conditional soft line break.
michael@0: PLAINTEXT, XMP
michael@0: PLAINTEXT causes the remaining content to no longer be parsed. XMP
michael@0: causes the content to not parse entities or other tags. The XMP can be
michael@0: closed by its own tag (on any boundary); PLAINTEXT is not closed (html3.2
michael@0: allows it to be closed). Both tags change the style to a fixed font of
michael@0: a
michael@0: LISTING
michael@0: Closes the paragraph. Does a hard line break on open and close. Open
michael@0: pushes a fixed width font style of a particular font size on the style
michael@0: stack. The close tag pops the top of the style stack.
michael@0:
Attributes: none
michael@0: PRE
michael@0: Closes the paragraph. The open tag does a hard line break. A fixed
michael@0: font style (unless VARIABLE is present) is pushed on the style stack. The
michael@0: close tag pops the top of the style stack. It also does a hard line break.
michael@0:
Attributes:
michael@0: WRAP
michael@0:
COLS=int [clamped: >= 0]
michael@0:
TABSTOP=int [clamped: >= 0; clamped value is replaced with default
michael@0: value]
michael@0:
VARIABLE
michael@0:
michael@0: NOBR
michael@0: This tag doesn't nest. Instead it just sets or clears a flag in the
michael@0: state machine. It has no effect on any other state.
michael@0: CENTER
michael@0: Closes the paragraph. Always does a conditional soft line break. The
michael@0: open tag pushes an alignment on the alignment stack. The close tag pops
michael@0: the top alignment off.
michael@0:
Attributes: none
michael@0: DIV
michael@0: Closes the paragraph. Always does a conditional soft line break. COLS
michael@0: defines the number of columns to layout in (like MULTICOL). The open tag
michael@0: pushes an alignment on the alignment stack (if COLS > 1 then it pretends
michael@0: to be a MULTICOL tag). The close tag pops an alignment from the alignment
michael@0: stack.
michael@0:
Attributes:
michael@0: ALIGN=divalign
michael@0:
COLS=int [if cols > 1 then DIV acts like a MULTICOL tag else
michael@0: DIV is just a container]
michael@0: GUTTER= int (clamped: >= 1)
michael@0:
WIDTH= value-or-pct [pct of right-left margin; clamped >= 1/0
michael@0: (strange code)]
michael@0:
michael@0:
michael@0: H1-H6
michael@0: Closes the paragraph. The open tag does a hard line break and pushes
michael@0: a style item which enables bold and disables fixed and italic. The close
michael@0: tag always pops the top item from the style stack. It also does a hard
michael@0: line break. If the ALIGN attribute is present then the open tag
michael@0: pushes an alignment on the alignment stack. The close tag will look at
michael@0: the top of the alignment stack and if it's a header of any kind (H1 through
michael@0: H6) then the alignment is popped. In either case the close tag also does
michael@0: a conditional soft line break (this happens before the hard line break).
michael@0:
Attributes:
michael@0:
michael@0:
michael@0: A note regarding closing paragraphs: Any time a close paragraph is done
michael@0: (for any tag) if the top of the alignment stack has a tag named "P" then
michael@0: a conditional soft line break is done and the alignment is popped.
michael@0:
michael@0:
michael@0:
michael@0: TABLE
michael@0: Closes the paragraph.
michael@0:
Attributes:
michael@0: ALIGN=left|right|center|abscenter
michael@0:
BORDER=int [clamped: if null then -1, if < 1 then 1 ]
michael@0:
BORDERCOLOR=string [if not supplied then set to the text color
michael@0: ]
michael@0:
VSPACE=int [ clamped: >= 0 ]
michael@0:
HSPACE=int [ clamped: >= 0 ]
michael@0:
BGCOLOR=color
michael@0:
BACKGROUND=url
michael@0:
WIDTH=value-or-pct [ % of win.width minus margins; clamped:
michael@0: >= 0 ]
michael@0:
HEIGHT=value-or-pct [ % of win.height minus margins; clamped:
michael@0: >= 0 ]
michael@0:
CELLPADDING=int [clamped: >= 0; separate pads take precedence
michael@0: ]
michael@0:
TOPPADDING= int [clamped: >= 0 ]
michael@0:
BOTTOMPADDING= int [clamped: >= 0 ]
michael@0:
LEFTPADDING= int [clamped: >= 0 ]
michael@0:
RIGHTPADDING= int [clamped: >= 0 ]
michael@0:
CELLSPACING= int [clamped: >= 0 ]
michael@0:
COLS=int [clamped: >= 0]
michael@0: The code supports more attributes in the Table attribute handler than it
michael@0: does in the code that gets the attributes from the tag! They are border_top,
michael@0: border_left, border_right, border_bottom, border_style (defaults to outset;
michael@0: allows for outset/dotted/none/dashed/solid/double/groove/ridge/inset).
michael@0: TR
michael@0: Open TR automatically closes an open table row (and an open table cell
michael@0: if one is open). It also automatically closes a CAPTION tag.
michael@0:
Attributes:
michael@0: BGCOLOR=color
michael@0:
BACKGROUND=url
michael@0:
VALIGN=top|bottom|middle|center(==middle)|baseline; default
michael@0: is top
michael@0:
ALIGN=left|right|middle|center(==middle); default is left
michael@0:
michael@0: TH, TD
michael@0: If no table then the tag is ignored (open or close). If no row is currently
michael@0: opened or the current row is currently done (because of a </TR> tag) then
michael@0: a new row is begun. Oddly enough the tag parameters for the row come from
michael@0: the TH/TD tag in this case. An open of either of these tags will automatically
michael@0: close the previous cell.
michael@0:
Attributes:
michael@0: COLSPAN=int [clamped: >= 1 && <= 1000 ]
michael@0:
ROWSPAN=int [clamped: >= 1 && <= 10000 ]
michael@0:
NOWRAP [boolean: disables wrapping ]
michael@0:
BGCOLOR=color [default: inherit from the row; if not row then
michael@0: table; if not table then inherit from an outer table cell; this works because
michael@0: the style is flattened so the outer table cell will have a color]
michael@0:
BACKGROUND=url [same rules as bgcolor for inheritance; tile
michael@0: mode is inherited too and not settable by TH/TD attributes (have to use
michael@0: style sheets for that)]
michael@0:
VALIGN=top|bottom|middle|center(==middle)|baseline; default
michael@0: is top
michael@0:
ALIGN=left|right|middle|center(==middle); default is left
michael@0:
WIDTH=value-or-pct [ clamped: >= 0 ]
michael@0:
HEIGHT=value-or-pct [ clamped: >= 0 ]
michael@0:
michael@0: CAPTION
michael@0: An open caption tag will automatically close an open table row (and
michael@0: an open cell).
michael@0:
Attributes:
michael@0:
michael@0: The code sets the vertical alignment to top w/o providing a mechanism for
michael@0: the user to set it (there is no VALIGN attribute).
michael@0: MULTICOL
michael@0: The open tag does a hard line break. The close tag checks to see if
michael@0: the state machine has an open multicol and if it does then it does a conditional
michael@0: soft line break and then continues to break until both margins are cleared
michael@0: of floating elements. It recomputes the margins based on the list indenting
michael@0: level (?). After the synthetic table is output the close tag does a hard
michael@0: line break.
michael@0:
michael@0: This tag will treat the input as source for a table with one row and
michael@0: COLS columns. The data is laid out using the width divided by the number
michael@0: of columns. After the total height is known, the content is partitioned
michael@0: as evenly as possible between the columns in the table.
michael@0:
Attributes:
michael@0:
COLS=int [clamped: values less than 2 cause the tag to be ignored]
michael@0:
GUTTER=int [clamped: >= 1]
michael@0:
WIDTH=value-or-pct [pct of right-left margin; clamped: >= 1/0
michael@0: (strange code)]
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: BLOCKQUOTE
michael@0: Closes the paragraph. The open tag does a hard line break. A list with
michael@0: the empty-bullet style is pushed on the list stack (unless TYPE=cite/jwz
michael@0: then a styled list is pushed). The close tag pops any list and does a hard
michael@0: line break.
michael@0:
Attributes:
michael@0:
michael@0:
michael@0: UL, OL, MENU, DIR
michael@0: For top-level lists (lists not in lists) a hard break is done on the
michael@0: open tag, otherwise a conditional-soft-break is done. The tag always
michael@0: closes the paragraph. The close tag does a conditional soft line break when nested;
michael@0: when not nested the close tag does a hard line break (even if no list is
michael@0: open). The open tag pushes the list on the list stack. The close tag pops
michael@0: any list off the list stack.
michael@0:
Attributes:
michael@0: TYPE= none | disc | circle | round | square | decimal | lower-roman
michael@0: | upper-roman | lower-alpha | upper-alpha | A | a | I | i [clamped: if
michael@0: none of the above is picked and OL then the bullet type is "number" otherwise
michael@0: the bullet type is "basic"]
michael@0:
START=int [clamped: >= 1]
michael@0:
COMPACT
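
The TYPE fallback rule can be illustrated with a small lookup. The function name is hypothetical; treating A, a, I and i as shorthands for the alpha and roman styles follows their usual HTML meaning and is an assumption here, as is matching the keyword values case-insensitively.

```python
_KEYWORDS = {
    "none", "disc", "circle", "round", "square", "decimal",
    "lower-roman", "upper-roman", "lower-alpha", "upper-alpha",
}
# Assumed meaning of the single-letter values (case matters for these).
_LETTERS = {
    "A": "upper-alpha", "a": "lower-alpha",
    "I": "upper-roman", "i": "lower-roman",
}


def bullet_type(type_attr, tag):
    """Resolve a list's bullet type from its TYPE attribute and tag name."""
    if type_attr in _LETTERS:
        return _LETTERS[type_attr]
    if type_attr is not None and type_attr.lower() in _KEYWORDS:
        return type_attr.lower()
    # Unrecognized or missing TYPE: OL falls back to "number", others to "basic".
    return "number" if tag.upper() == "OL" else "basic"
```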
michael@0:
michael@0: DL
michael@0: Closes the paragraph. For the open tag, if the list is nested then
michael@0: a conditional soft line break is done otherwise a hard line break is done.
michael@0: The open tag pushes a list on the list stack. The close tag pops any list
michael@0: from the list stack. Closing the list acts like closing other lists.
michael@0:
Attributes:
michael@0:
michael@0:
michael@0: LI
michael@0: Closes the paragraph. The open tag does a conditional soft line break.
michael@0: Close tags are ignored (except for closing the paragraph).
michael@0:
Attributes:
michael@0: TYPE= A | a | I | i (if the containing list is an OL)
michael@0:
TYPE= round | circle | square (if the containing list is not
michael@0: OL and not DL)
michael@0:
VALUE=int [clamped: >= 1]
michael@0: The magellan html parser allows the full set of list item styles from the
michael@0: OL/DL tag instead of just the limited set that nav4 allows.
michael@0: DD
michael@0: Closes the paragraph. Close tags are ignored (except for closing the
michael@0: paragraph). DD outside a DL just advances the X coordinate of layout by
michael@0: a small constant. DD inside a DL does a conditional soft line break and
michael@0: other margin crud.
michael@0:
Attributes: none.
michael@0: DT
michael@0: Closes the paragraph (open or close). Close tags are otherwise ignored.
michael@0: Does a conditional soft line break. Moves the X layout coordinate to the
michael@0: left margin.
michael@0:
Attributes: none
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: A
michael@0: Open anchors push a style on the style stack if the anchor has an HREF.
michael@0: Close anchors pop as many styles off the top of the style stack that are
michael@0: anchor tags (anchor tags don't nest in other words). In addition, any styles
michael@0: on the stack that have the ANCHOR bit set have it cleared and get their
michael@0: foreground and background colors fiddled with.
michael@0:
Attributes:
michael@0: NAME=string
michael@0:
HREF=url
michael@0: TARGET=target
michael@0:
SUPPRESS=true
michael@0:
michael@0:
michael@0: STRIKE, S, TT, CODE, SAMPLE, KBD, B, STRONG, I, EM, VAR, CITE, BLINK,
michael@0: BIG, SMALL, U, INLINEINPUT, SPELL
michael@0: The open tag pushes onto the style stack. The close tag always pops
michael@0: the top item from the style stack.
michael@0:
Attributes: none
michael@0: SUP, SUB
michael@0: The open tag pushes a font size decrease on the style stack. The close
michael@0: tag always pops the top of the style stack. The open and close tags impact
michael@0: the baseline. The only difference between SUP and SUB is how they impact
michael@0: the baseline. Note that the baseline information is forgotten after a line
michael@0: break; therefore a close SUP/SUB on the next line will do strange things.
michael@0:
Attributes: none
michael@0: SPAN
michael@0: Ignored by the navigator.
michael@0:
Attributes: none
michael@0: FONT
michael@0: The open font tag with no attributes resets the font size to the base
michael@0: font size. The open tag always pushes a style stack entry. The close tag
michael@0: always pops the top item off the style stack.
michael@0:
Attributes:
michael@0: SIZE=[+ int | - int | int ] [clamped: >=1 && <=
michael@0: 7]
michael@0:
POINT-SIZE=[+ int | - int | int ] [clamped: >= 1 &&
michael@0: <= 1600]
michael@0:
FONT-WEIGHT=[+ int | - int | int ] [clamped: >= 100 &&
michael@0: <= 900]
michael@0:
COLOR=colorspec
michael@0:
FACE=string
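
A sketch of resolving the SIZE attribute, under assumptions the list above does not confirm: signed values are taken relative to the base font size, the base size defaults to 3, and an unparsable value falls back to the base size. The function name is hypothetical; only the 1 to 7 clamp comes from the attribute list.

```python
def resolve_font_size(size_attr, base_size=3):
    """Resolve FONT SIZE to an absolute size clamped to the range 1..7."""
    text = "".join(size_attr.split())  # tolerate "+ 2" as well as "+2"
    try:
        if text[:1] in ("+", "-"):
            size = base_size + int(text)  # assumed: relative to base size
        else:
            size = int(text)
    except ValueError:
        return base_size  # assumed fallback for unparsable values
    return max(1, min(7, size))
```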
michael@0:
michael@0: A note regarding the style stack: The pop of the stack checks to see if
michael@0: the top of the stack is an ANCHOR tag. If it is not an anchor then the
michael@0: top item is unconditionally popped. If the top of the style stack is an
michael@0: anchor tag then the code searches for either the bottom of the stack or
michael@0: the first style stack entry not created by an anchor tag. If the entry
michael@0: is followed by another entry then the entry is removed from the stack (an
michael@0: out-of-order pop in other words). In this case the anchor style stack entry
michael@0: is left untouched.
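
The out-of-order pop described in this note can be sketched as follows. The stack representation (a list of `(tag, style)` tuples with the top at the end) and the function name are assumptions, and so is the choice to remove nothing when every entry on the stack was created by an anchor.

```python
def pop_style(stack):
    """Pop one style entry, skipping over anchor entries at the top.

    stack is a list of (tag, style) tuples; the top of the stack is the
    last element. Returns the removed entry, or None if nothing was removed.
    """
    if not stack:
        return None
    if stack[-1][0] != "A":
        return stack.pop()  # ordinary case: pop the top entry
    # Top is an anchor: find the nearest entry below it that is not one.
    for i in range(len(stack) - 2, -1, -1):
        if stack[i][0] != "A":
            return stack.pop(i)  # out-of-order pop; anchor entry stays
    return None  # assumed: only anchor entries left, remove nothing
```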
michael@0:
michael@0:
michael@0:
michael@0: text, entities
michael@0: These are basic content objects that get fed directly to the output.
michael@0: In navigator the text is processed by doing line-breaking (entities have
michael@0: been converted to text already by the parser). The line-breaking is controlled
michael@0: by the margin settings and the list depth, the floating elements, the style
michael@0: attributes (font size, etc.), the preformatted flag, the no-break flag
michael@0: and so on.
michael@0: IMG, IMAGE
michael@0: Close tag is ignored.
michael@0:
Attributes:
michael@0: ISMAP
michael@0:
USEMAP=url
michael@0:
ALIGN=alignparam
michael@0:
SRC=url [ whitespace is stripped ]
michael@0:
LOWSRC=url
michael@0:
ALT=string
michael@0:
WIDTH=value-or-pct (pct of right-left width)
michael@0:
HEIGHT=value-or-pct (pct of window height)
michael@0:
BORDER=int [clamped: >= 0]
michael@0:
VSPACE=int [clamped: >= 0]
michael@0:
HSPACE=int [clamped: >= 0]
michael@0:
SUPPRESS=true | false (only in blocked image layout???)
michael@0:
michael@0: HR
michael@0: Closes the paragraph. If an open tag then does a conditional soft line
michael@0: break. The rule inherits alignment from the parent container unless there
michael@0: is no container (then it's centered) or if the tag defines its own alignment.
michael@0: After the object is inserted into the layout stream a soft line break is
michael@0: inserted as well.
michael@0:
Attributes:
michael@0: ALIGN=divalign (sort of; in laytags.c it's divalign; in layhrule.c
michael@0: it's left or right only)
michael@0:
SIZE=int (1 to 100 inclusive)
michael@0:
WIDTH=val-or-pct (pct of right-left width)
michael@0:
NOSHADE
michael@0:
michael@0: BR
michael@0: Does an unconditional soft break. If clear is set then it will also
michael@0: soft break until either the left or right or both margins are clear of
michael@0: floating elements. Note that /BR == BR!
michael@0:
Attributes:
michael@0: CLEAR=left | right | all | both
michael@0:
michael@0: WBR
michael@0: Soft word break.
michael@0:
Attributes: none
michael@0: EMBED
michael@0: Close tag does nothing. Embeds operate inline just like images (they
michael@0: don't close the paragraph).
michael@0:
Attributes:
michael@0: HIDDEN=no | false | off
michael@0:
ALIGN=alignparam
michael@0:
SRC=url
michael@0:
WIDTH=val-or-pct (pct of right-left width)
michael@0:
HEIGHT=val-or-pct; if val is < 1 (sometimes) the element
michael@0: gets HIDDEN automatically
michael@0:
BORDER=int (unsupported by navigator)
michael@0:
VSPACE=int [clamped: >= 0]
michael@0:
HSPACE=int [clamped: >= 0]
michael@0:
michael@0: NOEMBED
michael@0: Used when EMBEDs are disabled. It is a container for regular content
michael@0: that has no stylistic consequences (no line breaking, no style stack effect,
michael@0: etc.).
michael@0: APPLET
michael@0: Applet tags don't nest (there is a notion of current_applet). The open
michael@0: tag automatically closes an open applet tag.
michael@0:
Attributes:
michael@0: ALIGN=alignparam
michael@0:
CODE=string
michael@0:
CODEBASE=string
michael@0:
ARCHIVE=string
michael@0:
MAYSCRIPT
michael@0:
NAME=string [clamped: white space is stripped out]
michael@0:
WIDTH=value-or-pct [pct of right-left width; clamped: >= 1]
michael@0:
HEIGHT=value-or-pct [pct of window height; clamped >= 1]
michael@0:
BORDER=int [clamped: >= 0]
michael@0:
HSPACE=int [clamped: >= 0]
michael@0:
VSPACE=int [clamped: >= 0]
michael@0: If no width is provided:
michael@0: if a height was provided, use the height. Otherwise, use 90% of the
michael@0: window width if percentage widths are allowed, otherwise use a value of
michael@0: 600.
michael@0:
michael@0: If no height is provided:
michael@0: if a width was provided, use the width. Otherwise, use 50% of the window
michael@0: height if percentage widths are allowed, otherwise use a value of 400.
michael@0: If the applet is hidden, then the width/height get forced to zero.
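
The default sizing rules for APPLET can be written out as a short sketch. The function and parameter names are hypothetical, and integer pixel arithmetic is assumed; the fallbacks themselves (90% of window width or 600, 50% of window height or 400, zero when hidden) follow the rules stated above.

```python
def applet_size(width, height, window_width, window_height,
                allow_percent=True, hidden=False):
    """Return the (width, height) used for an applet.

    width and height are the values from the tag, or None when absent.
    """
    if hidden:
        return 0, 0
    w, h = width, height
    if w is None:
        if height is not None:
            w = height
        else:
            w = window_width * 90 // 100 if allow_percent else 600
    if h is None:
        if width is not None:
            h = width
        else:
            h = window_height * 50 // 100 if allow_percent else 400
    return w, h
```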
michael@0: PARAM
michael@0: The param tag is supported when contained by the APPLET tag or the
michael@0: OBJECT tag. It has no stylistic consequences. The attribute values from
michael@0: the tag are passed to the containing APPLET or OBJECT. Note that /PARAM
michael@0: == PARAM.
michael@0:
Attributes:
michael@0: NAME=string [clamped: white space is stripped out]
michael@0:
VALUE=string [clamped: white space is stripped out]
michael@0: White space is stripped as follows: leading and trailing whitespace
michael@0: is removed. Embedded spaces are left alone, but any other embedded
michael@0: whitespace character (a non-space whitespace) is removed.
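
A minimal sketch of that stripping rule, assuming Python's notion of whitespace (`str.isspace`) stands in for whatever character set the navigator actually tests; the function name is hypothetical.

```python
def strip_attr_whitespace(value: str) -> str:
    """Trim both ends, then drop embedded whitespace other than plain spaces."""
    trimmed = value.strip()
    # Keep ordinary spaces; remove tabs, newlines and other non-space whitespace.
    return "".join(ch for ch in trimmed if ch == " " or not ch.isspace())
```

With this sketch, `"  a b\tc\nd  "` becomes `"a bcd"`: the outer spaces are trimmed, the embedded space survives, and the tab and newline are removed.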
michael@0: OBJECT
michael@0: The open tag pushes an object onto the object stack. The close tag
michael@0: pops from the object stack. I don't understand how the data stuff works.
michael@0:
Attributes:
michael@0: CLASSID=string (clsid:, java:, javaprogram:, javabean: are the
michael@0: supported prefixes; maybe it's a url if no prefix shown?)
michael@0:
TYPE=string (a mime type)
michael@0:
DATA=string (data: prefix mentions a url)
michael@0: There are more attributes that depend on the type of object being embedded
michael@0: in the page. If the object is a java bean (?) then the applet parameters
michael@0: are supported:
michael@0: CLASSID
michael@0:
HIDDEN
michael@0:
ALIGN
michael@0:
CLASSID (instead of CODE)
michael@0:
CODEBASE
michael@0:
ARCHIVE
michael@0:
MAYSCRIPT
michael@0:
ID (applets use NAME)
michael@0:
WIDTH
michael@0:
HEIGHT
michael@0:
BORDER
michael@0:
HSPACE
michael@0:
VSPACE
michael@0:
michael@0: MAP
michael@0: The open tag automatically closes an open map (maps don't nest). There
michael@0: is no stylistic consequence of the map nor does it provide any visible
michael@0: presentation in the normal layout case (an editor would do something different).
michael@0: The map can be declared anywhere in the document.
michael@0:
Attributes:
michael@0: NAME=string [clamped: white space is stripped out]
michael@0:
michael@0: AREA
michael@0: Does nothing if there is no current map or the tag is a close tag.
michael@0:
Attributes:
michael@0: SHAPE=default | rect | circle | poly | polygon
michael@0:
ALT=string [clamped: newlines are stripped]
michael@0:
COORDS=coord-list
michael@0:
HREF=url
michael@0: TARGET=target (only if HREF is specified)
michael@0: SUPPRESS
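
The interaction between MAP and AREA described above (maps don't nest, AREA needs a current map, TARGET only counts alongside HREF) might be modeled like this. The class `MapTracker` and its method names are invented for the illustration and do not correspond to any navigator data structure.

```python
class MapTracker:
    """Tracks the single 'current map' while tags are processed (hypothetical)."""

    def __init__(self):
        self.maps = {}       # map name -> list of area attribute dicts
        self.current = None  # areas of the currently open map, if any

    def open_map(self, name):
        # Maps don't nest: opening a map implicitly closes any open one.
        self.current = []
        self.maps[name] = self.current

    def close_map(self):
        self.current = None

    def area(self, is_close=False, **attrs):
        # AREA does nothing without a current map, or when it is a close tag.
        if self.current is None or is_close:
            return False
        attrs = dict(attrs)
        if "href" not in attrs:
            # TARGET is only honored when HREF is specified.
            attrs.pop("target", None)
        self.current.append(attrs)
        return True
```

An AREA seen before any MAP is opened, or after the map is closed, is simply dropped in this sketch.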
michael@0:
michael@0: SERVER
michael@0: A container for server-side javascript. Not evaluated by the client
michael@0: (parsed and ignored). Note: The navigator parser doesn't expand entities
michael@0: in a SERVER tag.
michael@0: SPACER
michael@0: Close tag is ignored. Open tag provides whitespace during layout: TYPE=line/vert/vertical
michael@0: causes a conditional soft line break and then adds SIZE to the Y
michael@0: layout coordinate. TYPE=word causes a conditional soft word break
michael@0: and then adds SIZE to the X layout coordinate. TYPE=block
michael@0: causes blockish layout stuff to happen.
michael@0:
Attributes:
michael@0: TYPE=line | vert | vertical | block (default: word)
michael@0: ALIGN=alignparam (these 3 params are only for TYPE=block)
michael@0:
WIDTH=value-or-pct
michael@0:
HEIGHT=value-or-pct
michael@0: SIZE=int [clamped: >= 0]
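
As a rough illustration of how SIZE moves the layout position for the non-block types, consider the sketch below. It models only the coordinate change: the conditional soft breaks and the TYPE=block behavior are not modeled, and treating unrecognized TYPE values as `word` is an assumption based on the stated default. The function name and signature are hypothetical.

```python
def apply_spacer(x, y, spacer_type="word", size=0):
    """Return the (x, y) layout position after a SPACER open tag (sketch only)."""
    size = max(0, size)            # SIZE is clamped to >= 0
    kind = spacer_type.lower()
    if kind in ("line", "vert", "vertical"):
        return x, y + size         # vertical spacer: advance Y
    if kind == "block":
        return x, y                # block layout is not modeled here
    return x + size, y             # word (assumed for anything else): advance X
```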
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: SCRIPT
michael@0: Note: The navigator parser doesn't expand entities in a SCRIPT tag.
michael@0:
Attributes:
michael@0: LANGUAGE=LiveScript | Mocha | JavaScript1.1 | JavaScript1.2
michael@0:
TYPE="text/javascript" | "text/css"
michael@0:
HREF=url
michael@0:
ARCHIVE=url
michael@0:
CODEBASE=url
michael@0:
ID=string
michael@0:
SRC=url
michael@0:
michael@0: NOSCRIPT
michael@0: Used when scripting is off or by backrev browsers. It is a container
michael@0: that has no stylistic consequences.
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: FORM
michael@0: Attributes:
michael@0: ACTION=href
michael@0:
ENCODING=string
michael@0:
TARGET=string
michael@0:
METHOD=get | post
michael@0:
michael@0: ISINDEX
michael@0: This tag is a shortcut for creating a form element with a submit button
michael@0: and a single text field. If the PROMPT attribute is not present in the
michael@0: tag then the value used is "This is a searchable index. Enter search
michael@0: keywords:".
michael@0:
michael@0: Attributes:
michael@0:
PROMPT=string
michael@0:
ACTION=href
michael@0:
ENCODING=string
michael@0:
TARGET=string
michael@0:
METHOD=get | post
michael@0:
michael@0: INPUT
michael@0: Attributes vary according to type:
michael@0: TYPE= text | radio | checkbox | hidden | submit | reset | password
michael@0: | button | image | file | jot | readonly | object
michael@0:
NAME= string
michael@0:
michael@0: TYPE=image
michael@0: attributes are from the IMG tag (!)
michael@0: TYPE= text | password | file
michael@0: font style is forced to fixed
michael@0:
VALUE= string
michael@0:
SIZE= int (clamped; >= 1)
michael@0:
MAXLENGTH= int (not clamped!)
michael@0: TYPE= submit | reset | button | hidden | readonly
michael@0: VALUE=string; default if no value to the attribute varies according
michael@0: to the type:
michael@0: submit -> "Submit Query"
michael@0:
reset -> "Reset"
michael@0:
others -> " " (2 spaces)
michael@0:
Note also that the value has newlines stripped from it
michael@0: WIDTH=int (clamped >=0 && <= 1000) (only for submit,
michael@0: reset or button)
michael@0:
HEIGHT=int (clamped >=0 && <= 1000) (only for submit,
michael@0: reset or button)
michael@0: TYPE=radio | checkbox
michael@0: CHECKED (flag - if present then set to true)
michael@0:
VALUE= string (the default value is "on")
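
The default VALUE and the WIDTH/HEIGHT clamping for button-like inputs could be sketched as below. The helper names are hypothetical, and two points are assumptions: an absent attribute and an empty attribute are both treated as "no value", and "newlines" is taken to mean both CR and LF.

```python
DEFAULT_VALUES = {"submit": "Submit Query", "reset": "Reset"}


def button_value(input_type, value=None):
    """Pick the VALUE for submit/reset/button/hidden/readonly inputs."""
    if not value:
        # Assumption: missing and empty VALUE both fall back to the default.
        # Types other than submit/reset default to two spaces.
        return DEFAULT_VALUES.get(input_type, "  ")
    # Newlines are stripped from the value (assumed: CR and LF).
    return value.replace("\r", "").replace("\n", "")


def clamp_dimension(n):
    """Clamp WIDTH/HEIGHT of submit, reset or button inputs to 0..1000."""
    return max(0, min(1000, n))
```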
michael@0:
michael@0: SELECT
michael@0: Attributes:
michael@0: MULTIPLE (boolean)
michael@0:
SIZE= int (clamped >= 1)
michael@0:
NAME= string
michael@0:
WIDTH= int (clamped >= 0 && <= 1000)
michael@0:
HEIGHT= int (clamped >= 0 && <= 1000; only examined
michael@0: for single entry lists (!multiple || size==1))
michael@0:
michael@0: OPTION
michael@0: Lives inside the SELECT tag (ignored otherwise).
michael@0:
Attributes:
michael@0: VALUE=string
michael@0:
SELECTED boolean
michael@0:
michael@0: TEXTAREA
michael@0: Attributes:
michael@0: NAME=string
michael@0:
ROWS=int (clamped; >= 1)
michael@0:
COLS=int (clamped; >= 1)
michael@0:
WRAP= off | hard | soft (default is off; any value which is
michael@0: not known turns into soft)
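
The WRAP rule can be expressed in a few lines. This is a sketch under the assumption that matching is case-insensitive and ignores surrounding whitespace, which the text above does not state; the function name is hypothetical.

```python
def normalize_wrap(value=None):
    """Map a TEXTAREA WRAP attribute to off, hard or soft."""
    if value is None:
        return "off"                 # attribute absent: default is off
    v = value.strip().lower()        # assumption: case-insensitive match
    if v in ("off", "hard", "soft"):
        return v
    return "soft"                    # any unknown value turns into soft
```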
michael@0:
michael@0: KEYGEN
michael@0: Attributes:
michael@0: NAME=string
michael@0:
CHALLENGE=string
michael@0:
PQG=string
michael@0:
KEYTYPE=string
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: BASEFONT
michael@0: Sets the base font value which +/- size values in FONT tags are relative
michael@0: to.
michael@0:
Attributes:
michael@0: SIZE=+ int | - int | int (just like FONT)
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
Unsupported
michael@0: NSCP_CLOSE, NSCP_OPEN, NSCP_REBLOCK, MQUOTE, CELL, SUBDOC, CERTIFICATE,
michael@0: INLINEINPUTTHICK, INLINEINPUTDOTTED, COLORMAP, HYPE, SPELL, NSDT
michael@0: These tags are unsupported because they are used internally by Netscape
michael@0: and are never seen in real content. If somebody does use them between 4.0
michael@0: and magellan, tough beans. We never documented them so they lose.
michael@0:
michael@0:
michael@0: