XML2RFC must die, was: Re: Two different threads - IETF Document Format

Iljitsch van Beijnum <iljitsch@xxxxxxxxx> · Sun, 5 Jul 2009 15:24:32 +0200

My apologies for the subject line. I'm very disappointed that the  
silent majority of draft authors isn't speaking up. I can't imagine  
that the vast majority of draft authors has absolutely no problems  
with XML2RFC. So I'm assuming they've been ignoring the thread;
hopefully the new subject line will get some of them to chime in. If
that doesn't happen I'll shut up and try to figure out why I have so  
much trouble with something that nobody else finds difficult.

On 4 jul 2009, at 13:27, John Levine wrote:

> I think it's reasonable to assume that going forward the vast majority
> of users who read online documents will be able to use software that
> can reformat them in various ways.  This tells me that although the
> publication form has to be readable in a pinch as plain text, it's
> more important that it's amenable to mechanical processing.  Tidily
> formatted xml2rfc would be a reasonable candidate

No, it's not. The problem with XML2RFC formatted drafts and RFCs is  
that you can't display them reasonably without using XML2RFC, and  
although XML2RFC can run on many systems in theory, in practice it's  
very difficult to install and run successfully because it's written in
Tcl and many XML2RFC files depend on the local availability of
references. When those aren't present the conversion fails.

The philosophy behind XML2RFC is to encode meaning in the XML wherever  
possible, rather than simply display text. There are several problems  
with that:

1. It makes it hard to write source files, because now rather than  
type "Experimental" at the top of the file, I have to know what  
XML2RFC looks for to determine the draft's status. Same thing with  
boilerplate, references, etc.

2. It makes it hard to read source files for the same reason. You  
can't read an XML2RFC formatted XML file without prior knowledge and  
get all the information that would be displayed in the final draft/RFC  
format.

3. It gets it wrong. XML2RFC "knows" that you create a name from an  
initial, a period, a space and a last name. So initial "I" and last  
name "Van Beijnum" becomes "I. Van Beijnum". However, XML2RFC doesn't  
know that in Dutch, certain last name prefixes are capitalized if they  
appear at the beginning of the name (Van Beijnum) but not if they're  
in the middle because there are first names or initials: "I. van  
Beijnum".
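
To make the rule in point 3 concrete, here is a minimal sketch in Python.
It is not how XML2RFC works; format_author and the DUTCH_PREFIXES list are
hypothetical names invented for this illustration, and the prefix list is
deliberately incomplete.

```python
# Hypothetical illustration only; not part of XML2RFC.
# Incomplete list of Dutch surname prefixes, assumed for this sketch.
DUTCH_PREFIXES = {"van", "de", "der", "den", "ter", "te", "ten"}


def format_author(initial: str, surname: str) -> str:
    """Format a name, lowercasing surname prefixes when an initial precedes them."""
    words = surname.split()
    # Count the leading words that are prefixes, always keeping the last
    # word as the actual family name.
    n = 0
    while n < len(words) - 1 and words[n].lower() in DUTCH_PREFIXES:
        n += 1
    prefixes = [w.lower() for w in words[:n]]
    rest = words[n:]
    if initial:
        # Something precedes the surname: prefixes stay lowercase.
        return f"{initial}. " + " ".join(prefixes + rest)
    if prefixes:
        # The surname starts the name: capitalize only the first prefix.
        prefixes[0] = prefixes[0].capitalize()
    return " ".join(prefixes + rest)


print(format_author("I", "Van Beijnum"))  # I. van Beijnum
print(format_author("", "van Beijnum"))   # Van Beijnum
```

Even this small rule needs a language-specific word list, which is the
point: a tool that assembles names from parts has to know conventions like
this one, or give the author a way to override the result.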

This means that the makers of XML2RFC spent a lot of time making the  
tool require the authors to spend a lot of time to create something  
that is sometimes incorrect, with no means to correct the problem. An  
all-around waste of time.

Then there is the problem with XML in general. Now apparently there  
are XML editors that can make sure you create syntactically correct  
XML without having to take care of all the details manually. But as  
someone who has otherwise no need to write XML, I'm not familiar with  
those tools. So I write my XML2RFC source by hand. The result is that  
I invariably get error messages that the <section> and </section> tags  
don't match properly. This is a problem that is extremely hard to  
debug manually, especially as just grepping for "section" isn't  
enough: there could be a , </middle> etc somewhere between a  
<section> and </section> that breaks everything.

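For what it's worth, the kind of check that would help here can be
sketched in a few lines of Python. This is an assumption-laden
illustration, not an XML2RFC feature: find_mismatches is a hypothetical
helper, it uses a regular expression rather than a real XML parser, and it
assumes each tag fits on one line and that attribute values contain no
angle brackets.

```python
import re

# Matches <name ...>, </name> and <name .../>; comments, processing
# instructions and DOCTYPE declarations are simply not matched.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>")


def find_mismatches(source: str) -> list[str]:
    """Report closing tags that do not match the most recently opened tag."""
    problems = []
    stack = []  # (name, line number) of elements that are still open
    for lineno, line in enumerate(source.splitlines(), start=1):
        for closing, name, self_closing in TAG.findall(line):
            if self_closing:
                continue  # <xref .../> opens and closes in one tag
            if not closing:
                stack.append((name, lineno))
            elif not stack:
                problems.append(f"line {lineno}: </{name}> has no opening tag")
            elif stack[-1][0] == name:
                stack.pop()
            else:
                open_name, open_line = stack[-1]
                problems.append(
                    f"line {lineno}: </{name}> closes <{open_name}> "
                    f"opened on line {open_line}"
                )
                # Recover: if the element is open further down the stack,
                # discard everything above it so later tags line up again.
                if name in [n for n, _ in stack]:
                    while stack.pop()[0] != name:
                        pass
    for name, lineno in stack:
        problems.append(f"line {lineno}: <{name}> is never closed")
    return problems


draft = "<middle>\n<section>\n<t>text\n</section>\n</middle>"
for problem in find_mismatches(draft):
    print(problem)  # line 4: </section> closes <t> opened on line 3
```

The useful part is that the report names the element that was actually
left open and where it was opened, which grepping for "section" cannot
tell you.
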
First writing a source file and then compiling it into an output file
is no longer something that is familiar to most people. When
I write anything other than a draft, I can simply select "header level  
2" and I know that everything will be taken care of. I don't have to  
explicitly tell my word processor where the text following a header  
level 2 ends, because the presence of another header makes that clear  
both to me and to the software.
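
As a rough sketch of how software can infer where a section ends from the
next heading alone, consider the following Python. The input format (a
list of (level, text) pairs, with level 0 meaning body text) and the name
nest_sections are assumptions made up for this example; the element names
in the output merely resemble XML2RFC's.

```python
from xml.sax.saxutils import escape, quoteattr


def nest_sections(items):
    """Turn a flat list of headings and paragraphs into nested sections.

    items is a list of (level, text) pairs: level 0 is a paragraph,
    level 1 and up is a heading of that level (an assumed input format).
    """
    out = []
    open_levels = []  # heading levels of the sections currently open
    for level, text in items:
        if level == 0:
            out.append("  " * len(open_levels) + f"<t>{escape(text)}</t>")
            continue
        # A new heading implicitly ends every open section at the same
        # or a deeper level; nobody has to type the closing tag.
        while open_levels and open_levels[-1] >= level:
            open_levels.pop()
            out.append("  " * len(open_levels) + "</section>")
        out.append("  " * len(open_levels) + f"<section title={quoteattr(text)}>")
        open_levels.append(level)
    # The end of the document closes whatever is still open.
    while open_levels:
        open_levels.pop()
        out.append("  " * len(open_levels) + "</section>")
    return "\n".join(out)


print(nest_sections([
    (1, "Introduction"),
    (0, "Some text."),
    (2, "Terminology"),
    (1, "Security Considerations"),
]))
```

The author only ever marks where something starts; every closing tag is
derived, so the open and close tags cannot get out of step.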

What we need is the ability to write drafts with a standard issue word  
processor. I'm sure that sentence conjured up nightmares of Word  
documents with insane formatting being mailed around clueless  
bureaucracies, but that's not what I mean. Word processors use styles
to tag headings, text, quotes, lists and so on: the exact same stuff  
that you can do in XML but rather than having to think about it  
(especially closing all tags correctly) it happens easily,  
automatically and without getting in the way. (I can even change the  
style for an entire paragraph with a single menu selection or function  
key without having to find the beginnings and ends of that paragraph.)

Formatting is then based on the style tags, with all explicit  
formatting applied by the word processor removed. This is standard
operating procedure in 99% of publishing. (The other 1% being  
scientific/engineering books where the authors send in LaTeX.)

All the stuff that can't be handled by styles should just be copied  
and pasted from the boilerplate, without the need for tools to know  
about the structure of these things. (At least not in the draft stage,  
perhaps this can be useful in the final stages of RFC editing.)