Re: Image attachments to ASCII RFCs (was: Re: Last Call: 'Proposed Experiment: Normative Format in Addition to ASCII Text' to Experimental RFC (draft-ash-alt-formats))

Iljitsch van Beijnum <iljitsch@xxxxxxxxx> · Tue, 20 Jun 2006 00:18:08 +0200

On 19-jun-2006, at 20:09, John C Klensin wrote:

(2) If I prepare an RFC draft using some mechanism which
produces a document in form X, where X might include

	* ASCII text, via emacs or vi, with a post processor for
headers, footers, or page numbers
	* xml2rfc
	* MSWord plus some templates and post processing
	* nroff with a profile different from the RFC Editor's standard
	* LaTeX or TeX
	* Obscure word processor 7b

then the RFC Editor makes changes, possibly extensive and
returns the revised document in an ASCII text format.

Now I have never written an RFC so I don't know how all of this  
works, but I have written two books for different publishers so I  
know how this kind of thing works elsewhere. My question: does the  
RFC editor really live up to his/her name and perform extensive  
edits? And if so, what is the nature of those edits? I can't imagine
they go to the technical content, so either it's language (copy edit)  
or formatting, right?

No matter how good my tools are, I'm going to have to do
considerable, mostly manual, work to retrofit those changes into
my source.

Ok. If we're talking about copy edits here, then this is one for the  
problem statement. Note, by the way, that the formatting authors are
normally expected to do consists of indicating heading levels, block
quotes, captions, that kind of thing. This is very easy to do with styles in
modern word processing applications, which generally make anything  
tagged with a style look different. It's also easy to do with old  
school tools such as nroff and HTML/XML, and unless I'm mistaken,  
there are tools available to convert between the two approaches.  
(Word processors generally have a document format in common that  
preserves the style tags if not their attributes.)

If there is an iterative cycle of changes between the authors and the  
RFC editor, I think it's necessary that this is done with some form  
of style tagging. In addition, more advanced word processors such as  
Word and Star/Open Office/Writer support a "track changes" feature  
where any changes made to a document are identified along with who  
made them. This makes sending different versions of a document back  
and forth infinitely easier, but it has the downside that many word  
processors and other tools don't support this, so the selection of
tools is much smaller.

But, if it is not, then one of the discussions we should be
having, IMO, is about selecting one or two additional input
formats (xml2rfc and Word stand out as candidates to me) in
which the RFC Editor is willing to accept input documents and
return editing results to the authors for review.

Yes, this makes sense. Especially if the Word format is only used as  
a "lingua franca" without depending on specific details. I.e., I can  
write a text in Open Office, tag it with the right styles and export  
to .doc format. In Word or another word processor that can read those  
files, the styles will still be present even if the document looks  
slightly different from the way it did when I wrote it because some  
information is lost when saving in a non-native file format.

Doing so
would solve a large number of problems, not only wrt easily
producing PS/PDF forms when needed, but for producing revised
versions of the relevant documents for updated standards, etc.
That suggestion has been made several times; as far as I know,
the discussion has never gotten off the ground.

Let's see if we can do better now.

(3) Finally, if one adopts the "plates in the back" model, the
PDF (or whatever) document contains the illustrations and _only_
the illustrations.  That makes it fairly insensitive to the RFC
editing process.

I'm not too happy about that. If we're going the route of using  
images for formulas and drawings, it makes sense to make those  
available as separate files in an "easy" format such as GIF or PNG,  
or, in addition or alternatively, a PDF can be produced where the  
images are present in-line with the text. Having the images separate
from the text in a PDF has all the disadvantages of PDF (possibly
large files, a small selection of viewers, security and compatibility
issues), and on top of that it's still inconvenient to read the text
while having to hunt for the images separately.

So I suggest that, generally, we would find
that, in a "text plus figures" model, the editing process would
generally change the text only, permitting the figures/images to
remain unchanged.

That is not my experience. Figures are very hard to get right, and
they very often require edits.

As I have said in response to other notes, I'm not convinced
that "text plus figures" (or "image attachments") is a good
enough idea to be worth writing up and considering -- in terms
of either IETF needs or RFC Editor workload.  But it has enough
advantages relative to either version of the status quo
--"ASCII-only" or "of course you can produce a version in PDF if
you are motivated enough and we hope you won't get motivated
very often"-- that I think it needs a bit more consideration
than it has been getting.

Two prominent problems associated with the ASCII format are that it  
doesn't really support formulas and figures. I was intrigued with the  
earlier Unicode examples, so I decided to do some checking of my own  
with regard to "unicode art" for figures. Have a look at:

http://www.muada.com/drafts/utf8-art.txt
http://www.muada.com/drafts/utf16-art.txt

I think this is closer to what an RFC with Unicode line art would  
look like than trying to present an example in email. For me, the  
UTF-8 encoding isn't immediately decoded properly by my browser, but  
the UTF-16 version is. I also can't get this displayed properly on  
the command line on my Mac. Still, it's not _too_ hard to have the  
Unicode characters displayed properly. The Unicode line art looks a  
lot better than ASCII-only line art, but it shares many of the same  
limitations, such as only (reasonably) being able to display  
rectangular shapes and horizontal/vertical lines. There are some  
exceptions such as the ability to use rounded corners, but true round  
shapes or even usable diagonal lines don't seem to be supported. This  
also means that it should be generally possible to convert from  
Unicode to ASCII-only line drawings without much loss of information.
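
To illustrate the conversion from Unicode to ASCII-only line drawings, here is a minimal Python sketch. The mapping table and the function name `to_ascii_art` are my own assumptions, hand-picked for illustration; they are not taken from any existing tool. Rounded corners collapse to "+", which is the kind of small loss of information mentioned above.

```python
# Hypothetical, hand-picked mapping from Unicode box-drawing characters
# to ASCII stand-ins. It is not exhaustive; extend it as needed.
BOX_TO_ASCII = {
    "─": "-", "━": "-", "═": "=",                # horizontal lines
    "│": "|", "┃": "|", "║": "|",                # vertical lines
    "┌": "+", "┐": "+", "└": "+", "┘": "+",      # square corners
    "╭": "+", "╮": "+", "╰": "+", "╯": "+",      # rounded corners (detail lost)
    "├": "+", "┤": "+", "┬": "+", "┴": "+", "┼": "+",  # junctions
    "╱": "/", "╲": "\\",                         # diagonals
}


def to_ascii_art(text: str) -> str:
    """Replace known box-drawing characters; leave everything else as-is."""
    return text.translate(str.maketrans(BOX_TO_ASCII))


if __name__ == "__main__":
    unicode_box = "╭────╮\n│ ab │\n╰────╯"
    print(to_ascii_art(unicode_box))
```

Characters that are not in the table pass through unchanged, so a drawing that uses other symbols may still contain non-ASCII text after conversion.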

There has been some talk about specifying a font for displaying  
Unicode, but on my Mac at least, that doesn't seem to be necessary.  
An important issue with different fonts is the difference in  
character width for different characters, but the line art characters  
are mostly the same width so this isn't an issue. The width of the
space character can vary, though there's probably a fixed width
space in the table somewhere. Also, it looks like there is only a
single glyph for these types of characters that is shared between  
fonts. I.e., whether I use Courier or Times, the line drawing  
characters look the same.
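
As a rough way to look for candidate space characters, the following Python sketch lists every code point in the Unicode "space separator" category (Zs) using the standard `unicodedata` module. The function name `space_separators` is an assumption of mine, and the listing says nothing about how any particular font renders these characters next to line art; that would still have to be checked by eye.

```python
import sys
import unicodedata


def space_separators():
    """Return (code point, name) pairs for all characters in category Zs."""
    found = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Zs":
            found.append((f"U+{cp:04X}", unicodedata.name(ch)))
    return found


if __name__ == "__main__":
    for code_point, name in space_separators():
        print(code_point, name)
```

The output includes the ordinary space (U+0020) as well as entries such as FIGURE SPACE (U+2007), which may be worth trying in a line drawing.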

It does seem to me that looking into Unicode for better formula and  
drawing support makes a lot of sense. This allows us to make better  
looking RFCs without radically changing the way RFCs are published.

However, I think we probably want to change the process for other  
reasons. I think it would be very useful to have the "source" of an  
RFC available with style tagging and so on in order to more easily  
derive future work. It's probably also a good idea to have "blessed"  
PDFs or some similar format for pretty printing, especially for the  
RFCs that contain formulas and drawings. And we may want to make  
those formulas and drawings available as simple bitmap images so they  
can be easily viewed on systems with limited capabilities. But with  
all of that in place, it's probably a good idea to keep having the  
ASCII version of RFCs be the normative version.

Anyway, that's my $0.02 Canadian on this subject.
