[GRASS-dev] Manpage HTML markup consistency

Glynn Clements wrote:

> Also, although g.html2man attempts to handle tables, it doesn't
> seem to work.

I think it's just missing a line break,

Index: tools/g.html2man/g.html2man

--- tools/g.html2man/g.html2man (revision 30354)
+++ tools/g.html2man/g.html2man (working copy)
@@ -52,7 +52,7 @@
   $ncols = -1 + scalar split /<td/i, $one_row;
   $has_header = m|<th|i ;
   s|[\n\r]||g;
- s|</tr.*?>|\n|gis;
+ s|</tr.*?>|\n.br\n|gis;
   s|</td.*?>|\t|gis;
   foreach $tag ( qw(<td.*?> <th.*?> </th> <tr.*?>) ){
     s/$tag//gi;

?

editing the r.in.xyz.1 file by hand with that modification makes the
table look nice in the man page. groff has some tbl() function too, but
I'm not at all familiar with how to use it or how portable it is.
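
For reference, tbl is a preprocessor that ships with groff rather than a function: a table is written between .TS and .TE requests, with format lines before the tab-separated data. A minimal sketch of what emitting such a block could look like, written in Python rather than the script's Perl and assuming the rows have already been extracted as lists of strings. The name rows_to_tbl is hypothetical and not part of g.html2man, and whether every platform's man command runs tbl is not settled here.

```python
def rows_to_tbl(rows, has_header=False):
    """Render rows (lists of cell strings) as a tbl table block.

    Assumes cells are plain text; tabs inside a cell are replaced by
    spaces because tab is tbl's default column separator.
    """
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    fmt = []
    if has_header:
        # First format line applies to the first data row: left-aligned, bold.
        fmt.append(" ".join(["lb"] * ncols))
    # Last format line ends with "." and applies to all remaining rows.
    fmt.append(" ".join(["l"] * ncols) + ".")
    lines = [".TS"] + fmt
    for row in rows:
        cells = [cell.replace("\t", " ").strip() for cell in row]
        cells += [""] * (ncols - len(cells))  # pad short rows
        lines.append("\t".join(cells))
    lines.append(".TE")
    return "\n".join(lines) + "\n"
```

Man pages that rely on tbl conventionally start with a line containing `'\" t` so that man knows to run the preprocessor.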

from within the script:
# TODO: rewrite g.html2man using the HTML::Parser module

As for rewriting with Python, sure that could be really good. But as
Python will not be a mandatory dependency until GRASS 7 the g.html2man
shell script should remain as the default for the 6.x branches.

Hamish

Maris Nartiss wrote:

> Sorry for being from year 2008, but how many GRASS users still use man
> pages to read how to use GRASS modules? There are some really good
> text-based web browsers (links - my favorite), that support tables
> etc. thus eliminating need of man pages at all.

As an XEmacs user, I find e.g.

  C-h RET d.vect RET

faster and easier to use than

  M-x w3-find-file RET /opt/grass-6.3.svn/docs/html/d.vect.html RET

even with filename completion.

Also, the manpages don't have the blinding white background of the
GRASS HTML files.

--
Glynn Clements <glynn@gclements.plus.com>

Dylan Beaudette wrote:

> I wonder if now would be a good time to investigate the use of CSS in the man
> pages. If we define a couple types of "container" objects (<div>, <span>,
> etc) we can use a single style file to later manipulate the look and feel of
> the manual pages.

If you're going to overhaul the documentation, I suggest going all the
way and using something which is intended to be used as a source for
multiple formats (at least HTML and nroff, with one or more of TeX,
PDF and PostScript as options), e.g. DocBook.

--
Glynn Clements <glynn@gclements.plus.com>

Hamish wrote:

> Also, although g.html2man attempts to handle tables, it doesn't
> seem to work.

> I think it's just missing a line break,
>
> Index: tools/g.html2man/g.html2man
>
> --- tools/g.html2man/g.html2man (revision 30354)
> +++ tools/g.html2man/g.html2man (working copy)
> @@ -52,7 +52,7 @@
>   $ncols = -1 + scalar split /<td/i, $one_row;
>   $has_header = m|<th|i ;
>   s|[\n\r]||g;
> - s|</tr.*?>|\n|gis;
> + s|</tr.*?>|\n.br\n|gis;
>   s|</td.*?>|\t|gis;
>   foreach $tag ( qw(<td.*?> <th.*?> </th> <tr.*?>) ){
>     s/$tag//gi;
>
> ?
>
> editing the r.in.xyz.1 file by hand with that modification makes the
> table look nice in the man page. groff has some tbl() function too, but
> I'm not at all familiar with how to use it or how portable it is.

Inserting .br between lines would certainly be an improvement.

Unfortunately, the above change doesn't achieve that; all of the
table-related tags have been stripped by the point that CvtTable is
called.
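
One way around that ordering problem would be to collect the rows with a real parser before any tags are stripped, and only then emit .br between rows. A minimal sketch under that assumption, in Python using the standard html.parser module. The names TableRows and table_to_roff are hypothetical, and this is not how CvtTable currently works.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # text pieces of the cell being read, if any

    def _flush_cell(self):
        # Store the finished cell with runs of whitespace collapsed.
        if self._cell is not None:
            self.rows[-1].append(" ".join("".join(self._cell).split()))
            self._cell = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR> and <tr> both arrive as "tr".
        if tag == "tr":
            self._flush_cell()
            self.rows.append([])
        elif tag in ("td", "th"):
            self._flush_cell()  # HTML allows the closing </td> to be omitted
            if not self.rows:
                self.rows.append([])
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._flush_cell()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_to_roff(html):
    """Return tab-separated rows with a .br request between them."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    lines = ["\t".join(row) for row in parser.rows if row]
    return "\n.br\n".join(lines) + "\n"
```

Because the rows are gathered from parser events, the result does not depend on whether other substitutions have already removed the table tags from the text.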

> As for rewriting with Python, sure that could be really good. But as
> Python will not be a mandatory dependency until GRASS 7 the g.html2man
> shell script should remain as the default for the 6.x branches.

Bear in mind that perl is only a dependency because g.html2man uses
it. Personally, I see no difference between requiring perl in order to
build the manual pages and requiring python.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn mentioned DocBook as a possible future doc format...what about XML? Are there
any advantages to ever using that format, or is it pretty much the same as using HTML?

If we ever wanted to move to XML, or at least make it easier to migrate to it, all html
tags would have to be lowercased, as I believe XML expects it. I've written a small sed
script to do this already, but I didn't want to modify all 300+ docs and do a massive
svn commit if there's no point.
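
For context, XML element names are case-sensitive and XHTML defines its elements in lowercase, which is why the case of the tags matters. The sed script itself is not shown in the thread; the following is only a rough sketch of the same idea in Python, with the hypothetical name lowercase_tags. It lowercases element names in opening and closing tags and deliberately leaves attribute names and values alone.

```python
import re

# "<" or "</" followed by an element name.
_TAG_NAME = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)")


def lowercase_tags(text):
    """Lowercase HTML element names, leaving everything else untouched."""
    return _TAG_NAME.sub(
        lambda m: "<" + m.group(1) + m.group(2).lower(), text
    )
```

A pattern like this cannot tell markup from a literal "<" followed by letters inside example text, so the resulting diff would still need to be reviewed before committing.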

~ Eric.

Hi,

2008/2/26, Patton, Eric <epatton@nrcan.gc.ca>:

Glynn mentioned DocBook as a possible future doc format...what about XML? Are there
any advantages to ever using that format, or is it pretty much the same as using HTML?

DocBook is a semantic markup language -- an XML language [1]. One
example is the MPlayer docs system [2]; it could be a good inspiration
for DocBook-based, multi-language GRASS manual pages.

If we ever wanted to move to XML, or at least make it easier to migrate to it, all html
tags would have to be lowercased, as I believe XML expects it. I've written a small sed
script to do this already, but I didn't want to modify all 300+ docs and do a massive
svn commit if there's no point.

Moving to DocBook would be an advantage, provided we do not break the
K.I.S.S. principle. Another issue for GRASS 7.

Martin

[1] http://en.wikipedia.org/wiki/Docbook
[2] http://svn.mplayerhq.hu/mplayer/trunk/DOCS/

--
Martin Landa <landa.martin gmail.com> * http://gama.fsv.cvut.cz/~landa *

On Tuesday 26 February 2008, Patton, Eric wrote:

Glynn mentioned DocBook as a possible future doc format...what about XML?
Are there any advantages to ever using that format, or is it pretty much
the same as using HTML?

This was the kind of thing I had in mind when originally mentioning CSS
block-level classes. I suppose that I should read about how the PDF docs are
created by the Makefile-- but it would be nice to have the man pages in the
most malleable format possible.

If we ever wanted to move to XML, or at least make it easier to migrate to
it, all html tags would have to be lowercased, as I believe XML expects it.
I've written a small sed script to do this already, but I didn't want to
modify all 300+ docs and do a massive svn commit if there's no point.

DocBook, custom XML, or even some kind of LaTeX hybrid (like the R manual
system) might be useful. Moving things between HTML and Man page format would
be another story-- but probably doable with some kind of simple
parser/converter.

Dylan

~ Eric.

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

On Monday 25 February 2008, Glynn Clements wrote:

Dylan Beaudette wrote:
> I wonder if now would be a good time to investigate the use of CSS in the
> man pages. If we define a couple types of "container" objects (<div>,
> <span>, etc) we can use a single style file to later manipulate the look
> and feel of the manual pages.

If you're going to overhaul the documentation, I suggest going all the
way and using something which is intended to be used as a source for
multiple formats (at least HTML and nroff, with one or more of TeX,
PDF and PostScript as options), e.g. DocBook.

Right-- this was the thought, although block-level CSS seemed like a middle
ground.

I am not familiar with DocBook, but here is a good start:
http://en.wikipedia.org/wiki/DocBook

There is a Debian package called 'docbook-defguide' which looks like it
contains much good information, saved (on my system) here:
/usr/share/doc/docbook-defguide/html/docbook.html

It would be nice to have the option of converting the base manual into one's
favorite format: Man pages, HTML, LaTeX, PDF, etc.

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

DocBook has been considered for OSGeo edu material so there has been
quite a bit of discussion on that - this is what Frank had to say:

On the whole DocBook issue - we tried using DocBook for a while for MapServer
docs and ended up abandoning it because installing and getting to understand
DocBook tools was too hard for many potential contributors. It also turned
out to be a clumsy format to work in. Perhaps things have improved, or
we mapserverites were particularly dumb - but take that at least as a mild
cautionary tale. We ended up with documents written in html, and restructured
text in plone though we aren't so thrilled with that either. There is
some consideration being given to just moving to a Trac wiki (though Trac
wiki is particular weak as a wiki in my opinion).

here is the discussion:
http://lists.osgeo.org/pipermail/edu_discuss/2008-January/thread.html

I tried DocBook and you just have to learn and get used to a new thing
and it has its own complexities and I am not sure it is worth it.

And BTW I am one of those people who find having the old-fashioned man pages
on hand useful - I work a lot from home and it was much faster to view
the simwe man pages that I was modifying using the old format than waiting
for them to pop up in a remotely run web browser or moving them around.
I would also like to suggest keeping the man pages simple and easy to maintain:
the more complex they get, the fewer people will be able to maintain them and
the more complex the task will become.

Helena

Hi,

I do not know if anybody has already ruled out LaTeX for better
documentation output, but this format has its potential as well...

Jachym

--
Jachym Cepicky
e-mail: jachym.cepicky gmail com
URL: http://les-ejk.cz
GPG: http://www.les-ejk.cz/pgp/jachym_cepicky-gpg.pub

Hi,

Dylan:

DocBook, custom XML, or even some kind of LaTeX hybrid (like the R
manual system) might be useful. Moving things between HTML and Man
page format would be another story-- but probably doable with some
kind of simple parser/converter.

HTML is Hyper *TEXT Markup Language*. XML is anything Markup Language.
HTML is clearer & native for our need, and much more well known.

What we need is a text markup language and that's exactly what we've
got, I don't see any point in moving away from it. As it is a
structured text markup language there are many tools to cleanly convert
it to other document formats. We need a clear text markup language with
access to links, and that's exactly what HTML provides. I've never felt
limited by it.

If there's a problem with the help pages it has to do with out of date
content, not the markup structure. And Eric has stepped up to help
tackle the out-of-date problem. Perhaps some dead-link-checking tool
could be helpful to highlight SEE ALSOs to unported GRASS 5 modules?
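
A rough sketch of what such a dead-link check could look like, assuming all generated pages sit in a single directory and link to each other with relative href values ending in .html. It is written in Python for brevity; the name find_dead_links and the directory layout are assumptions, not an existing GRASS tool.

```python
import os
import re

# Matches href="something.html" (optionally followed by #anchor), any case.
_HREF = re.compile(r'href\s*=\s*"([^"#]+\.html)(?:#[^"]*)?"', re.IGNORECASE)


def find_dead_links(html_dir):
    """Return sorted (page, target) pairs whose local target is missing."""
    dead = set()
    for name in sorted(os.listdir(html_dir)):
        if not name.endswith(".html"):
            continue
        path = os.path.join(html_dir, name)
        with open(path, encoding="utf-8", errors="replace") as fh:
            text = fh.read()
        for target in _HREF.findall(text):
            if "://" in target:
                continue  # external URL, not a local manual page
            if not os.path.exists(os.path.join(html_dir, target)):
                dead.add((name, target))
    return sorted(dead)
```

Run over the generated HTML directory, each reported pair is a page together with a link target that does not exist there, which would include SEE ALSO entries pointing at modules that were never ported.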

It would be nice to have the option of converting the base manual
into one's favorite format: Man pages, HTML, LaTeX, PDF, etc.

we can already do that.

as discussed, the default `make` creates man pages

to get PDF versions just run:
make html2pdfdoc
   or
make html2pdfdoccomplete

The above two require the htmldoc program (-> PS, PDF)
If LaTeX is wanted, there is gnuhtml2latex, LyX, probably many others.

In Frank's message fwd'd by Helena, he mentions reStructured text.
Perhaps good for writing a book but not for help pages IMO. (I used it
in a script to create the PDF book version of
galleryofmapprojections.com; but still needed to hack in raw LaTeX to
the result to get what I wanted)

The current issue with g.html2man is just a tiny coding bug, easily
fixed. The perl dependency and the brittleness of it are not nice, but
99% of the description.html files do not use advanced tags and so it
suffices. Also it is already written and tested, which counts for a
lot.

FWIW:
$ apt-cache search html2
gnuhtml2latex - A Perl script that converts html files to latex
html2ps - HTML to PostScript converter
html2text - An advanced HTML to text converter
html2wml - converts HTML pages to WML (WAP) or i-mode pages
libgtkhtml2-0 - HTML rendering/editing library (for GNOME2)
libgtkhtml2-dev - HTML rendering/editing library for (GNOME2)
libgtkhtml2-ruby - GtkHTML bindings for the Ruby language
lyx - High Level Word Processor
stx2any - Converter from structured plain text to other formats
sylpheed-claws-gtk2-html2-viewer - HTML mail/attachment viewer for
Sylpheed-Claws GTK2 mailer
xhtml2ps - HTML to PostScript converter (Tcl/Tk GUI frontend)

2c,
Hamish

This was also discussed at OSGeo Edu and I am so far sticking with LaTeX
for the lecture notes and the GRASSbook is written in LaTeX too.
I am not sure about its suitability for manpages -
many people are scared of its complexity although there is an easy
to use text editor for it (see below) and I don't find it complex at
all, as long as Markus does the layout and formatting :-)

Helena

http://lists.osgeo.org/pipermail/edu_discuss/2008-January/000682.html
http://www.lyx.org

On Tuesday 26 February 2008, Helena Mitasova wrote:

DocBook has been considered for OSGeo edu material so there has been
quite a bit of discussion on that - this is what Frank had to say:

On the whole DocBook issue - we tried using DocBook for a while for
MapServer
docs and ended up abandoning it because installing and getting to
understand DocBook tools was too hard for many potential contributors. It
also turned out to be a clumsy format to work in. Perhaps things have
improved, or we mapserverites were particularly dumb - but take that at
least as a mild cautionary tale. We ended up with documents written in
html, and restructured text in plone though we aren't so thrilled with that
either. There is some consideration being given to just moving to a Trac
wiki (though Trac wiki is particular weak as a wiki in my opinion).

here is the discussion:
http://lists.osgeo.org/pipermail/edu_discuss/2008-January/thread.html

I tried DocBook and you just have to learn and get used to a new thing
and it has its own complexities and I am not sure it is worth it.

And BTW I am one of those people who find having the old fashioned man pages
on hand useful - I work a lot from home and it was much faster to view
the simwe man pages that I was modifying using the old format than waiting
for them to pop-up in remotely run web browser or move them around.
I would also like to suggest keeping the man pages simple and easy to
maintain, the more complex it gets, the fewer people will be able to
maintain it and the more complex the task will become.

Helena

This brings up a good question-- authoring. Are the manual pages going to be
*mostly* generated on the fly from keywords and a template, as they are now,
or do others envision some other approach? As it stands, the boilerplate HTML
document that accompanies each module isn't usually that complex, so
authoring/editing these things wouldn't be too difficult (if a format like
DocBook were to be used).

The complications would probably be primarily associated with writing new
material, or longer how-to documents (like much of the material on the
Mapserver page).

I personally prefer LaTeX for documentation writing, but this may introduce
too much overhead for users interested only in HTML documents. There are
well-established tools to accomplish this, but not all users will have a TeX
install. I noticed a tool called Latex2Man [1] which could simplify man page
generation from a 'core' documentation set written in LaTeX.

1. http://ctan.tug.org/tex-archive/support/latex2man/latex2man.html

Dylan

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

On Tuesday 26 February 2008 12:04:10 pm Hamish wrote:

Hi,

Dylan:
> DocBook, custom XML, or even some kind of LaTeX hybrid (like the R
> manual system) might be useful. Moving things between HTML and Man
> page format would be another story-- but probably doable with some
> kind of simple parser/converter.

HTML is Hyper *TEXT Markup Language*. XML is anything Markup Language.
HTML is clearer & native for our need, and much more well known.

Right-- HTML works well and is simple to author / maintain.

What we need is a text markup language and that's exactly what we've
got, I don't see any point in moving away from it. As it is a
structured text markup language there are many tools to cleanly convert
it to other document formats. We need a clear text markup language with
access to links, and that's exactly what HTML provides. I've never felt
limited by it.

I agree 100%. I was just throwing out some ideas. Perhaps I became a little
too interested in adding structure / complexity where it is not needed.

If there's a problem with the help pages it has to do with out of date
content, not the markup structure. And Eric has stepped up to help
tackle the out-of-date problem. Perhaps some dead-link-checking tool
could be helpful to highlight SEE ALSOs to unported GRASS 5 modules?

> It would be nice to have the option of converting the base manual
> into one's favorite format: Man pages, HTML, LaTeX, PDF, etc.

we can already do that.

I should have been more clear: I meant that (if we were to switch to some
other markup/encoding of the core documentation) ... it would be nice to
convert that into Man pages.

as discussed, the default `make` creates man pages

to get PDF versions just run:
make html2pdfdoc
   or
make html2pdfdoccomplete

The above two require the htmldoc program (-> PS, PDF)
If LaTeX is wanted, there is gnuhtml2latex, LyX, probably many others.

Sure. Again, I was throwing out some ideas.

In Frank's message fwd'd by Helena, he mentions reStructured text.
Perhaps good for writing a book but not for help pages IMO. (I used it
in a script to create the PDF book version of
galleryofmapprojections.com; but still needed to hack in raw LaTeX to
the result to get what I wanted)

The current issue with g.html2man is just a tiny coding bug, easily
fixed. The perl dependency and the brittleness of it are not nice, but
99% of the description.html files do not use advanced tags and so it
suffices. Also it is already written and tested, which counts for a
lot.

OK. What I had in mind when posting some of these last messages was some kind
of base format (machine readable) in which the docs were stored / created at
compile time, such that conversion to HTML, Man pages etc. could be improved
by modern text processing (XSLT for XML, latex2man for latex, etc)
approaches.

However, if the docs are kept simple then the existing Perl script should do
the job.

FWIW:
$ apt-cache search html2
gnuhtml2latex - A Perl script that converts html files to latex
html2ps - HTML to PostScript converter
html2text - An advanced HTML to text converter
html2wml - converts HTML pages to WML (WAP) or i-mode pages
libgtkhtml2-0 - HTML rendering/editing library (for GNOME2)
libgtkhtml2-dev - HTML rendering/editing library for (GNOME2)
libgtkhtml2-ruby - GtkHTML bindings for the Ruby language
lyx - High Level Word Processor
stx2any - Converter from structured plain text to other formats
sylpheed-claws-gtk2-html2-viewer - HTML mail/attachment viewer for
Sylpheed-Claws GTK2 mailer
xhtml2ps - HTML to PostScript converter (Tcl/Tk GUI frontend)

2c,
Hamish

Cheers,
Dylan

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

Hamish wrote:

> DocBook, custom XML, or even some kind of LaTeX hybrid (like the R
> manual system) might be useful. Moving things between HTML and Man
> page format would be another story-- but probably doable with some
> kind of simple parser/converter.

HTML is Hyper *TEXT Markup Language*. XML is anything Markup Language.
HTML is clearer & native for our need, and much more well known.

What we need is a text markup language and that's exactly what we've
got, I don't see any point in moving away from it.

HTML may be a text markup language, but it isn't a very good one. It
provides far too many features which are intended to directly control
appearance. That makes it unsuitable for conversion to other formats
which don't have exactly the same display model as HTML.

In short, it's much easier to generate HTML than it is to convert HTML
to another format.

If there's a problem with the help pages it has to do with out of date
content, not the markup structure.

No, there's a real problem with people assuming that the files are
just normal HTML files. They aren't normal HTML files; they're
g.html2man source files. The two aren't the same thing.

The current issue with g.html2man is just a tiny coding bug, easily
fixed.

Fixing the "tiny coding bug" only fixes the symptoms; fixing the real
problem is harder. The real problem is that the format of those files
isn't documented anywhere. Generic HTML documentation doesn't help,
because g.html2man doesn't understand arbitrary HTML, nor will any
replacement.

If you're going to settle on some subset of HTML as the source
language, you need to specify exactly which subset that is.
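
Once such a subset is written down, checking a page against it is mechanical. A minimal sketch in Python using the standard html.parser module; the tag list below is purely illustrative and is not the set g.html2man actually supports, and the names TagAudit and audit are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical whitelist for illustration only; the real subset would have
# to be defined by the project.
ALLOWED_TAGS = {
    "h2", "h3", "p", "br", "b", "i", "em", "a", "pre", "code",
    "ul", "ol", "li", "dl", "dt", "dd", "table", "tr", "td", "th",
}


class TagAudit(HTMLParser):
    """Record every start tag that is not in the allowed set."""

    def __init__(self, allowed):
        super().__init__()
        self.allowed = allowed
        self.unsupported = []  # (tag, line number) pairs

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <FONT> and <font> are the same.
        if tag not in self.allowed:
            self.unsupported.append((tag, self.getpos()[0]))


def audit(html, allowed=ALLOWED_TAGS):
    """Return the (tag, line) pairs in html that fall outside the subset."""
    checker = TagAudit(allowed)
    checker.feed(html)
    checker.close()
    return checker.unsupported
```

A check like this only covers which tags appear; it says nothing about how each tag is written or what it should turn into in the man page.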

--
Glynn Clements <glynn@gclements.plus.com>

Dylan Beaudette wrote:

I personally prefer LaTeX for documentation writing, but this may introduce
too much overhead for users interested only in HTML documents. There are
well-established tools to accomplish this, but not all users will have a TeX
install. I noticed a tool called Latex2Man [1] which could simplify man page
generation from a 'core' documentation set written in LaTeX.

TeX has much the same problem as HTML: if you need to generate any
kind of restricted format, you have to restrict usage to a subset
which can be accurately converted to all supported target formats.

E.g. complex equations may come out fine if you generate PostScript or
DVI output, but may be completely unintelligible when converted to
nroff and displayed on a terminal.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn:

The real problem is that the format of those files
isn't documented anywhere. Generic HTML documentation doesn't help,
because g.html2man doesn't understand arbitrary HTML, nor will any
replacement.

If you're going to settle on some subset of HTML as the source
language, you need to specify exactly which subset that is.

http://trac.osgeo.org/grass/browser/grass/trunk/tools/g.html2man/htmltags.txt

Hamish

Hamish wrote:

> The real problem is that the format of those files
> isn't documented anywhere. Generic HTML documentation doesn't help,
> because g.html2man doesn't understand arbitrary HTML, nor will any
> replacement.
>
> If you're going to settle on some subset of HTML as the source
> language, you need to specify exactly which subset that is.

http://trac.osgeo.org/grass/browser/grass/trunk/tools/g.html2man/htmltags.txt

It isn't just an issue of which tags, but the syntax of those tags.
E.g. HTML allows line breaks in the middle of a tag, and it allows
literal < and > characters in an attribute value. g.html2man doesn't
cope with either of those (most ad-hoc parsers don't).
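
To make the syntax point concrete, here is a small comparison in Python between a regex-based tag stripper, of the kind most ad-hoc converters use, and the standard html.parser module. Both function names are hypothetical, and the regex is only a stand-in for that style of parsing, not a copy of g.html2man's code.

```python
import re
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep only character data, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_regex(html):
    # Ad-hoc approach: a tag is "<", anything that is not ">", then ">".
    return re.sub(r"<[^>]*>", "", html)


def strip_tags_parser(html):
    # Parser approach: quoted attribute values may contain ">" and tags
    # may span several lines.
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


tricky = '<a title="x > y" href="d.vect.html">d.vect</a>'
print(repr(strip_tags_regex(tricky)))
print(repr(strip_tags_parser(tricky)))
```

On this input the regex version ends the tag at the ">" inside the title attribute and leaves attribute text in the output, while the parser version yields just the link text.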

There's also the issue of semantics, e.g. how those tags get
translated to nroff, how to link to related documentation, etc.

--
Glynn Clements <glynn@gclements.plus.com>

On Wednesday 27 February 2008, Glynn Clements wrote:

Dylan Beaudette wrote:
> I personally prefer LaTeX for documentation writing, but this may
> introduce too much overhead for users interested only in HTML documents.
> There are well-established tools to accomplish this, but not all users
> will have a TeX install. I noticed a tool called Latex2Man [1] which
> could simplify man page generation from a 'core' documentation set
> written in LaTeX.

TeX has much the same problem as HTML: if you need to generate any
kind of restricted format, you have to restrict usage to a subset
which can be accurately converted to all supported target formats.

E.g. complex equations may come out fine if you generate PostScript or
DVI output, but may be completely unintelligible when converted to
nroff and displayed on a terminal.

Good point.

It seems like some variant of well-defined XML would be the most flexible in
terms of storing the documentation. User-friendly HTML/Man pages would then
be generated from the XML.
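
As a rough illustration of that idea, the sketch below renders one small XML description to both HTML and man-page source. The element names (module, section, para) and the module name g.example are invented for the example and do not correspond to any existing GRASS schema; Python's standard xml.etree module is used only for brevity, where a real setup might use XSLT or another tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; the schema is made up for this sketch.
SAMPLE = """<module name="g.example">
  <section title="DESCRIPTION">
    <para>Prints an example
      message.</para>
  </section>
</module>"""


def _text(elem):
    """Element text with runs of whitespace collapsed."""
    return " ".join((elem.text or "").split())


def roff_escape(text):
    """Escape backslashes and protect a leading control character."""
    text = text.replace("\\", "\\e")
    if text.startswith((".", "'")):
        text = "\\&" + text
    return text


def to_html(root):
    out = ["<h1>%s</h1>" % escape(root.get("name"))]
    for sec in root.findall("section"):
        out.append("<h2>%s</h2>" % escape(sec.get("title")))
        for para in sec.findall("para"):
            out.append("<p>%s</p>" % escape(_text(para)))
    return "\n".join(out) + "\n"


def to_man(root):
    out = [".TH %s 1" % root.get("name")]
    for sec in root.findall("section"):
        out.append('.SH "%s"' % sec.get("title"))
        for para in sec.findall("para"):
            out.append(".PP")
            out.append(roff_escape(_text(para)))
    return "\n".join(out) + "\n"


root = ET.fromstring(SAMPLE)
print(to_html(root))
print(to_man(root))
```

Both outputs come from the same tree, so neither format has to be reverse-engineered from the other.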

I like the idea, but given the amount of resistance-- I am not sure that it
would be well received, so it may never be implemented.

Dylan

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341