Glynn,
Can you also put this important information into the WIKI programming guide so that it doesn't get lost as easily?
Michael
On Aug 19, 2008, at 3:46 PM, <grass-dev-request@lists.osgeo.org> <grass-dev-request@lists.osgeo.org> wrote:
Date: Tue, 19 Aug 2008 17:55:03 +0100
From: Glynn Clements <glynn@gclements.plus.com>
Subject: [GRASS-dev] HTML files
To: <grass-dev@lists.osgeo.org>
Message-ID: <18602.64231.566897.339008@cerise.gclements.plus.com>
Content-Type: text/plain; charset=us-ascii

I have been through and fixed some problems which prevented some of
the HTML files from validating. AFAICT, everything now validates (with
the sole exception of missing "alt" attributes within <img> tags).

Please ensure that all HTML files continue to validate against the
HTML 4.0 Transitional DTD. At some point, I want to replace g.html2man
with something more robust (e.g. something which handles tables), and
I don't particularly want to make a "smart" (i.e. fault-tolerant) HTML
parser (e.g. Beautiful Soup) a required dependency.

If you have OpenSP or OpenJade, you can validate an HTML file with
e.g.:

nsgmls -s -c /usr/share/sgml/openjade-1.3.2/pubtext/HTML4.soc <filename>.html

[The program may be called nsgmls or onsgmls, and the exact location
where the catalogues are installed will vary.]

This needs to be done on the completed HTML file in
dist.<arch>/docs/html; the <module>.html files in the module
directories won't normally validate, as they lack the header which is
added by running the module with the --html-description option.

FWIW, the most common error was using block elements (e.g. <div>,
<pre>, <p>) in contexts where only inline elements are allowed
(primarily <dt>).

You can determine which elements are allowed where from the DTD:
http://www.w3.org/TR/1998/REC-html40-19980424/sgml/loosedtd.html
E.g. the definition:
<!ELEMENT DT - O (%inline;)* -- definition term -->
indicates that only inline elements are allowed inside DT, while e.g.:
<!ELEMENT DD - O (%flow;)* -- definition description -->
indicates that both block and inline elements are allowed inside DD.
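
As a rough illustration of how those two definitions differ, here is a
minimal Python sketch. The names BLOCK, INLINE, CONTENT_MODEL and
is_allowed_child are made up for this example, and the sets hold only a
handful of elements rather than the full DTD, so treat it as a sketch
and not as a validator:

```python
# Hypothetical, heavily abbreviated element classes (NOT the full DTD).
BLOCK = {"P", "DL", "DIV", "PRE", "TABLE", "UL", "OL"}
INLINE = {"#PCDATA", "TT", "B", "I", "EM", "CODE", "SPAN", "A"}
FLOW = BLOCK | INLINE

# Content models transcribed from the two definitions quoted above.
CONTENT_MODEL = {
    "DT": INLINE,  # <!ELEMENT DT - O (%inline;)*>
    "DD": FLOW,    # <!ELEMENT DD - O (%flow;)*>
}


def is_allowed_child(parent, child):
    """Return True if `child` may appear directly inside `parent`."""
    return child.upper() in CONTENT_MODEL[parent.upper()]


print(is_allowed_child("dt", "pre"))  # False: PRE is a block element
print(is_allowed_child("dd", "pre"))  # True: DD accepts block and inline
print(is_allowed_child("dt", "tt"))   # True: TT is an inline element
```

A real check should still be done with nsgmls against the DTD; the
lookup only shows why <pre> is rejected inside <dt> but accepted inside
<dd>.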
If you don't want to read the DTD, here's a rough summary:
Entity classes:
%StyleSheet     = <CSS stylesheet>
%Script         = <JavaScript code>
%html.content   = HEAD, BODY
%head.content   = TITLE, ISINDEX, BASE
%heading        = H1, H2, H3, H4, H5, H6
%fontstyle      = TT, I, B, U, S, STRIKE, BIG, SMALL
%phrase         = EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR,
                  ACRONYM
%special        = A, IMG, APPLET, OBJECT, FONT, BASEFONT, BR, SCRIPT,
                  MAP, Q, SUB, SUP, SPAN, BDO, IFRAME
%formctrl       = INPUT, SELECT, TEXTAREA, LABEL, BUTTON
%list           = UL, OL, DIR, MENU
%head.misc      = SCRIPT, STYLE, META, LINK, OBJECT
%pre.exclusion  = IMG, OBJECT, APPLET, BIG, SMALL, SUB, SUP,
                  FONT, BASEFONT
%preformatted   = PRE
%block          = P, DL, DIV, CENTER, NOSCRIPT, NOFRAMES,
                  BLOCKQUOTE, FORM, ISINDEX, HR, TABLE, FIELDSET,
                  ADDRESS, %heading, %list, %preformatted
%inline         = #PCDATA, %fontstyle, %phrase, %special, %formctrl
%flow           = %block, %inline

The immediate children permitted for each element are:
A: %inline
ABBR: %inline
ACRONYM: %inline
ADDRESS: %inline, P
APPLET: %flow, PARAM
B: %inline
BDO: %inline
BIG: %inline
BLOCKQUOTE: %flow
BODY: %flow, INS, DEL
BUTTON: %flow
CAPTION: %inline
CENTER: %flow
CITE: %inline
CODE: %inline
COLGROUP: COL
DD: %flow
DEL: %flow
DFN: %inline
DIR: LI
DIV: %flow
DL: DT, DD
DT: %inline
EM: %inline
FIELDSET: %flow, LEGEND
FONT: %inline
FORM: %flow
FRAMESET: FRAMESET, FRAME, NOFRAMES
H1: %inline
H2: %inline
H3: %inline
H4: %inline
H5: %inline
H6: %inline
HEAD: %head.content, %head.misc
HTML: %html.content
I: %inline
IFRAME: %flow
INS: %flow
KBD: %inline
LABEL: %inline
LEGEND: %inline
LI: %flow
MAP: %block, AREA
MENU: LI
NOFRAMES: %flow
NOSCRIPT: %flow
OBJECT: %flow, PARAM
OL: LI
OPTGROUP: OPTION
OPTION: #PCDATA
P: %inline
PRE: %inline
Q: %inline
S: %inline
SAMP: %inline
SCRIPT: %Script
SELECT: OPTGROUP, OPTION
SMALL: %inline
SPAN: %inline
STRIKE: %inline
STRONG: %inline
STYLE: %StyleSheet
SUB: %inline
SUP: %inline
TABLE: CAPTION, COL, COLGROUP, THEAD, TFOOT, TBODY
TBODY: TR
TD: %flow
TEXTAREA: #PCDATA
TFOOT: TR
TH: %flow
THEAD: TR
TITLE: #PCDATA
TR: TH, TD
TT: %inline
U: %inline
UL: LI
VAR: %inline

Some elements don't allow certain elements as descendants:
A: A
BUTTON: %formctrl, A, FORM, ISINDEX, FIELDSET, IFRAME
DIR: %block
FORM: FORM
LABEL: LABEL
MENU: %block
PRE: %pre.exclusion
TITLE: %head.misc

Notes:

1. The children of DIR/MENU are LI, which is a block element, but
those LI can't contain block elements. UL/OL don't have this
restriction.

2. DT cannot contain block elements, but DD can. This means that you
can't use <div class="code"><pre> in a DT; use <span class="code"><tt>
instead. DIV and PRE are block elements; SPAN and TT are inline.

3. TABLE cannot have TR as a child. But TBODY can have TR, and TBODY
allows both the start and end tags to be omitted, so
<table><tr>....</tr></table> is really just a shorthand for
<table><tbody><tr>....</tr></tbody></table>.

4. P cannot contain blocks. So <p>...<div> is actually shorthand for
<p>...</p><div>. But <p>...<div>...</div>...</p> is an error, as the
</p> doesn't match any open element (the <div> implicitly closed the
original <p>, and P doesn't allow the start tag to be omitted).

5. HTML, HEAD, BODY, and TBODY allow the start tag to be omitted. With
the exception of TBODY, this feature shouldn't be used (it's a
nuisance to implement if the number of valid child tags is large).

-- 
Glynn Clements <glynn@gclements.plus.com>
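
The implied </p> described in note 4 can be illustrated with a small
Python sketch built on the standard html.parser module. The class
ImpliedPChecker, the function check and the abbreviated BLOCK set are
hypothetical names for this example only. The sketch closes an open P
only when it is the innermost open element and does not handle empty
elements such as BR or IMG, so it is a simplification and no substitute
for validating with nsgmls:

```python
from html.parser import HTMLParser

# Abbreviated set of block elements (assumption: enough for the demo).
BLOCK = {"p", "div", "pre", "dl", "ul", "ol", "table", "blockquote"}


class ImpliedPChecker(HTMLParser):
    """Track open elements, closing an open P when a block element starts."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements, innermost last
        self.errors = []  # messages for end tags with no open element

    def handle_starttag(self, tag, attrs):
        # P cannot contain blocks, so a block start tag implies </p>.
        if tag in BLOCK and self.stack and self.stack[-1] == "p":
            self.stack.pop()
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Close everything up to and including the matching element.
            while self.stack.pop() != tag:
                pass
        else:
            self.errors.append("</%s> doesn't match any open element" % tag)


def check(html):
    """Return the list of unmatched end-tag errors found in `html`."""
    checker = ImpliedPChecker()
    checker.feed(html)
    checker.close()
    return checker.errors


print(check("<p>text<div>code</div>"))          # []
print(check("<p>text<div>code</div>more</p>"))  # one error for </p>
```

The first call reports nothing because the <div> implicitly closes the
<p>; the second reports the stray </p>, which is the error case from
note 4.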