Re: [Tools-discuss] formatting follies, was The IETF's email


 





On Mon, Aug 21, 2023 at 1:44 PM Keith Moore <moore@xxxxxxxxxxxxxxxxxxxx> wrote:

On 8/21/23 13:16, Phillip Hallam-Baker wrote:

It has occurred to me that one way to solve the issue we are having in Everything is to use markdown as the document format: a format that is essentially a subset of HTML is going to be the easiest to render as HTML.
I am also leaning toward recommending that IETF use a subset of HTML. For IETF's purposes I don't think it's that tricky to define the subset used, but there's still a danger of a slippery slope: "Hey, HTML already supports the <FROB> tag so why can't we use it?" But it would have the virtue that pretty much everyone's email reader would already present such messages correctly.

Hence the attraction of an essentially pointless formatting change.

It is like when we set up the clean room for the wire chamber with a huge great big hump in the room that you had to climb over to get from the dirty side to the clean.
As for input to the list, we'd have to support:

- text/plain (with or without format=flowed)
- text/html (including lots of variations produced by various MUAs, now and in the past and future also)
- multipart/alternative (text/html; text/plain) - probably produce a different multipart/alternative with both parts derived from the text/html part of the subject message - the output html being a simplified version of the input, the output text/plain derived from the simplified html.   But the real point here is that it has to be dealt with explicitly.
- and perhaps also strip out some of the input
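As a rough illustration of what "dealt with explicitly" could look like for the input types listed above, here is a minimal sketch using Python's standard `email` package. The function name `select_body` and the returned labels are hypothetical, and the sketch only picks the part that later processing would start from; simplifying the HTML and deriving the text/plain output are not attempted here.

```python
from email import message_from_string, policy


def select_body(raw: str) -> tuple[str, str]:
    """Return (kind, text) for the part a list processor would start from.

    Hypothetical helper: handles only the three input shapes named above.
    """
    msg = message_from_string(raw, policy=policy.default)
    ctype = msg.get_content_type()

    if ctype == "text/plain":
        # format=flowed is a Content-Type parameter; absent means fixed.
        flowed = str(msg.get_param("format", "fixed")).lower() == "flowed"
        kind = "text/plain; format=flowed" if flowed else "text/plain"
        return kind, msg.get_content()

    if ctype == "text/html":
        return "text/html", msg.get_content()

    if ctype == "multipart/alternative":
        # Prefer the HTML part, since both outputs would be derived from it;
        # fall back to text/plain if no HTML part is present.
        part = msg.get_body(preferencelist=("html", "plain"))
        if part is not None:
            return part.get_content_type(), part.get_content()

    raise ValueError(f"unsupported content type: {ctype}")
```

Anything outside these three shapes raises an error in this sketch; a real list processor would presumably need a policy for attachments and other multipart types as well.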

And probably need to support markdown or something similar as a variant of text/plain, if for no other reason than to give senders of text/plain a non-ambiguous way of including ASCII art in their messages. (Yes, you can use heuristics to try to extract ASCII art from text/plain, but it seems tricky to get this right. I'd rather use markdown than heuristics.)
Another argument for markdown.


And perhaps we'd need to accept markdown embedded in text/html also (since many MUAs these days will generate text/html without the sender intending it)

But I think it's doable. The thing that bugs me most about this is that W3C HTML is a moving target, and it's moving in a direction that is less and less amenable to this kind of processing over time (or requires that such processing be more and more sophisticated over time).

It is not just a moving target, it is a target moving in a different direction.

Back when HTTP/2 discussion started, I tried to engage and carve out a place for Web Services. And very quickly realized that we don't need Web Services support in HTTP/2, we want them completely separated and a custom protocol designed for Web Services in which the Well Known service tag is pretty much the only header.

What we can't really expect is that we can form a WG to specify this, that will debate which parts of HTML to allow, and then produce an RFC specifying acceptable HTML for the kinds of discussion that IETF has.   Instead I think we need a research group to conduct experiments with some of these mechanisms in the context of one or more technical discussions, and report on their experiences and make recommendations.

+1

The one issue I would have there is the risk of being overly restrictive in the content, limiting it to just the types of discussion people are familiar with.

I wrote the following back when I was at CERN, it is based on the approach in TeX and Don Knuth responded (by email!) saying it looked valid. We did not get anything of the sort supported in Web browsers for another 15 years because 'it wasn't important'.


Of course, this is also a slippery slope: why not chemical formulas as well? Why not...

The justification I would give for doing math and just math is that 

1) it is the only markup that is typically used inline in text. 
2) if you can't express math in TeX, it isn't math notation any more.

For example, my thesis has a lot of very custom math markup, CSP, Z, and some custom notations. They are all handled by the TeX processor which has a very small number of very powerful rules. And in fact, it does support chemical notations.

Now a separate issue is how people would type this stuff in at the keyboard. And I rather suspect that a lot of the demand for 'plaintext' is really a demand to be able to edit messages from the keyboard.


I suspect most of us would prefer to type something more like:

/sum{n=1}{+/infinity}1/n^2

Only in an XMLish idiom.

The way it is structured in LaTeX is that you can have multiple math notations that map to the fundamental presentation widgets.
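For comparison, and assuming the shorthand above is meant as the sum of 1/n^2 for n from 1 to infinity, the same expression in standard LaTeX would be written roughly as:

```latex
\sum_{n=1}^{+\infty} \frac{1}{n^{2}}
```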

At this point, any math markup is going to have to be congruent to MathML in order to be viable. So it has to be possible to translate from the messaging format to MathML. It is also desirable to be able to round trip but this may be lossy.

The big challenge here would be adapting markdown so that quoting nested threads becomes reliable. The way I see it, things go pear-shaped because there is an interaction between line wrapping and quoting.

So the thread

Fred said
|That is nuts, I am going to yammer on and on and on and on and on and on and on and on and on and on and on

gets wrapped as

Fred said
|That is nuts, I am going to yammer on and on and on 
and on and on and on and on and on and on and on 
and on

And the only way to recover the threading is to apply heuristics.

So one hard and fast rule must be that clients MUST NOT wrap lines. A new line is always a new paragraph.
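The interaction can be sketched in a few lines of Python. This is only an illustration under assumptions: `|` is taken as the quote marker from the example above, the width of 52 is arbitrary, and the function names are hypothetical. It shows that a quote-unaware wrap strips the marker from continuation lines, whereas under the no-wrap rule the quote depth can be read directly off each line.

```python
import textwrap

QUOTE = "|"  # quote marker, taken from the example above


def naive_wrap(line: str, width: int = 52) -> list[str]:
    """Wrap one line the way a quote-unaware client might.

    Continuation lines do not get the quote marker back.
    """
    return textwrap.wrap(line, width=width)


def quote_depth(line: str) -> int:
    """Count the leading quote markers on a line."""
    return len(line) - len(line.lstrip(QUOTE))


def parse_unwrapped(message: str) -> list[tuple[int, str]]:
    """Parse a message in which every line is a whole paragraph.

    Returns (quote depth, paragraph text) pairs; no heuristics are needed.
    """
    return [
        (quote_depth(line), line.lstrip(QUOTE))
        for line in message.split("\n")
        if line
    ]


if __name__ == "__main__":
    quoted = "|That is nuts, I am going to yammer on" + " and on" * 10
    for piece in naive_wrap(quoted):
        # Only the first piece still carries the quote marker.
        print(quote_depth(piece), piece)
    print(parse_unwrapped("Fred said\n" + quoted))
```

After the naive wrap, nothing in the continuation lines distinguishes them from Fred's correspondent's own text, which is why recovery needs heuristics.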

When displaying proportional font text, new paragraphs always receive a separator space (or not) as determined by the user preference; preformatted text does not.


Oh and as for Markdown dialect, I think GitHub has essentially closed that debate.
