Re: recent website changes for docs

Karsten Wade <kwade@xxxxxxxxxx> · 27 Feb 2004 11:54:17 -0800

On Fri, 2004-02-27 at 11:15, Dave Pawson wrote:
> My only other suggestion is to chunk the source into tiny bits,
> then use a plain text to xml program,
> and chunk the big bits via some other program into article or somesuch.

May I suggest html2db:

http://www.cise.ufl.edu/~ppadala/tidy/

It does a very nice, compliant conversion.  Converting from HTML, it
can't know much more than to turn <pre /> into <literal />, but that
kind of thing is easy to fix.  I've converted multi-page HTML into
DocBook *ML in just a few hours with a simple convert and edit.  The
structure you get in the end is not the point; what matters is the chunks
of markup, which can then be put into a DocBook template.
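
As a rough illustration of the "easy to fix" part, here is a minimal
Python sketch of a cleanup pass over converted output.  It is not part
of html2db.  The function name and the heuristic (a <literal> whose text
spans more than one line was probably a <pre> block and reads better as
<programlisting>) are my own assumptions, and it expects well-formed XML
as input.

```python
import xml.etree.ElementTree as ET


def promote_multiline_literals(docbook_xml: str) -> str:
    """Rename multi-line <literal> elements to <programlisting>.

    Hypothetical helper: assumes a <literal> containing a line break
    came from an HTML <pre> block. Single-line literals are left alone.
    """
    root = ET.fromstring(docbook_xml)
    for elem in root.iter("literal"):
        # Collect all text inside the element, including nested markup.
        text = "".join(elem.itertext())
        if "\n" in text.strip():
            elem.tag = "programlisting"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<para>Run <literal>ls</literal> or:"
        "<literal>cd /tmp\nls -l</literal></para>"
    )
    print(promote_multiline_literals(sample))
```

Anything the heuristic gets wrong still needs a manual pass, so treat
this as a first sweep before hand editing.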

Of course, it's nice to have an editor that can do e.g. sgml-tag-region
and tag creation/completion for manually marking up missed or incorrect
bits.

- Karsten
-- 
Karsten Wade   :      Tech Writer, RHCE     :  o: +1.831.466.9664
kwade@xxxxxxxxxx : http://rhea.redhat.com/ :   c: +1.831.818.9995
         Red Hat Applications : WAF, CMS, Portal Server         
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --