Re: [Tools-discuss] messaging formatting follies, was The IETF's email

Keith Moore <moore@xxxxxxxxxxxxxxxxxxxx> · Sun, 27 Aug 2023 22:06:11 -0400

    On 8/27/23 14:35, John C Klensin wrote: 

        If, say, a list, or a recipient, wants to enforce a subset or
profile, it can do so by implementing a filter that removes
elements that don't conform to the profile.

      Of course.  But removing elements risks changing the meaning of
the message or at least of making it hard to understand. 

    Give me a little bit of credit.   Yes, removing elements potentially
    risks changing the meaning of the message, or at least one can find
    contrived examples where this could happen.  But we're not likely to
    remove elements that convey important visual information and
    otherwise leave the message intact.

    For that matter MUAs routinely add elements that the message author
    did not specify, often screwing up the formatting of the message,
    and also potentially changing the meaning.  But somehow most of us
    are okay with that.

       Of course, it would also invalidate any sort of message
integrity checks (signatures or otherwise) on the original message,
something we might or might not care about.

    Maybe IETF lists don't do this, but I've seen lots of messages
    mangled by mailing lists in such a way as to invalidate any
    signature or integrity check.   If we want to assure integrity
    and/or non-repudiation for contributions to IETF mailing lists, we
    can archive those messages before sending them to the list
    processor.   Assuming, of course, that the archiver doesn't
    invalidate the signature.

        A list can, if it wishes, also bounce email that doesn't
conform to the profile, or that egregiously fails to conform
to the profile.

      Sure.  But if the IETF were to adopt a specific profile and
enforce it that way, I can see a few possible consequences:

(1) If users compose mail in an MUA that provides good HTML
support, accidental mistakes that would cause mail to be bounced
or blocked would probably be inevitable.  Independent of what
that would do to the email traffic and delays in getting
messages through, it would seem to me to be a good way to
discourage participation in the IETF by all but the most
dedicated.

    I expect that some judicious choices would need to be made for:

    - which HTML elements were permissible,

    - which HTML elements were not permissible but would be corrected
    before forwarding, and

    - which HTML elements would be forbidden

    Clearly, we would not want to include in the "forbidden" list any
      elements that could easily be generated by accident.
    But I also feel compelled to point out that one of the
      alternatives to "permit a subset of HTML" that's being proposed is
      "permit no HTML at all".   And in my experience it's a lot easier
      with most MUAs to accidentally send HTML (which would presumably
      cause the message to bounce) than it is to accidentally send the
      kinds of HTML that I suspect we would want to cause a message to
      bounce.
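
    To make the three categories above concrete, here is a minimal
    sketch of how a list processor might apply them, using Python's
    standard html.parser module.  The element sets, the class name
    ProfileFilter, and the function apply_profile are all hypothetical
    and purely illustrative; nobody in this thread has proposed a
    specific list.  The sketch also drops all attributes, which a real
    filter would probably not want to do for things like links.

```python
from html.parser import HTMLParser

# Hypothetical policy for illustration only; these sets are assumptions,
# not a subset anyone has agreed on.
PERMITTED = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
             "blockquote", "pre"}
CORRECTED = {"font", "span", "div", "center"}        # tag dropped, text kept
FORBIDDEN = {"script", "object", "iframe", "embed"}  # message is bounced


class ProfileFilter(HTMLParser):
    """Rebuilds a message using only PERMITTED elements (attributes dropped)."""

    def __init__(self):
        # Keep character references as-is so escaped text stays escaped.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag in FORBIDDEN:
            self.violations.append(tag)
        elif tag in PERMITTED:
            self.out.append(f"<{tag}>")
        # CORRECTED (and unknown) tags are silently removed; their text stays.

    def handle_endtag(self, tag):
        if tag in PERMITTED and tag != "br":  # <br> has no closing tag
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def apply_profile(html):
    """Return ("bounce", [tags]) or ("forward", downgraded_html)."""
    f = ProfileFilter()
    f.feed(html)
    f.close()
    if f.violations:
        return "bounce", f.violations
    return "forward", "".join(f.out)
```

    Under these assumptions, a message containing only <div> and
    <font> wrappers would be forwarded with those wrappers removed,
    while one containing <script> would be bounced.  Treating unknown
    elements like "corrected" ones is a design choice in this sketch,
    made so that accidental markup does not cause a bounce.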

      (2) If we set the precedent of a special profile for messages
sent to our lists and enforcement of that profile as above,
others might decide we were setting a good example.  Some of
them would then either adopt and enforce our profile or, more
likely in many cases, develop their own variants on it and adopt
and enforce that.  That would amount to expecting originating
MUAs to help their users by adopting per-destination domain (or
address) validators or assistive tools.  The implications of
patterns like that to global interoperability of email should
probably be clear.

    It seems like a stretch.   MUA vendors might implement some small
    number of profiles, which in effect means that they'll convert from
    the kind of HTML they use to represent the message being authored
    to the IETF-approved subset, doing the conversion in the MUA in a
    product-specific fashion rather than in the list processor.
    Offhand, I suspect that doing the conversion in the list processor
    is the preferable option.

      (3) We could do the obvious, define our special profile with a
new media type of text/ietf (or maybe text/html+ietf or
text/html-email) and then accept or bounce traffic based on
whether or not the receiving list saw that media type.   Unless
there are far more composing MUAs and submission servers
(counted by either number of systems or number of users) out
there that allow the originator to specify whatever media type
they like than I think there are, this would likely lead to a
combination of the consequences of (1) and (2): discouraging
participation in the IETF _and_ leading to (or encouraging)
general interoperability problems.

    There are lots of ways to tag HTML content, but I guess we're more
    likely to use some kind of internal tag (like a DTD) to distinguish
    our HTML subset from other kinds of HTML, than a MIME media type.
    We want our HTML subset to be treated as HTML by the recipient's
    mail reader, not as an unknown content-type.  And part of the point
    of using HTML is to leverage the existing HTML display support in
    most mail readers (even if we don't like the HTML that some MUAs
    generate by default).

      Tentative conclusion #1: If one wants to restrict IETF mail
traffic to a special format, pick one that is already supported
by an existing media type, rather than claiming something is, e.g.,
HTML but then trying to require and enforce a special profile of it.
I believe that leaves either text/plain or text/richtext (see
below about the latter).

    Well, restricting to text/plain is certainly an option, and one I
      could probably live with.   I think the subset HTML option is
      worth exploring but identifying the precise subset and the
      downgrading algorithm is tricky.

      text/richtext is a nonstarter IMO, because it was never widely
      adopted, and the number of MUAs that support generating it is
      within epsilon of zero.

      We want something that's of practical use to IETF, not a purely
      academic exercise.

        If tools to do these things become widely available and become
widely used by mailing lists, there will be pressure on MUA
vendors/authors to implement that subset.

      Sure.  Do you have a realistic plan about making that happen?

    All I'm doing at the moment is suggesting some relatively
    low-complexity experiments to see what works well.  It's very
    premature to make a deployment plan.

      Before you answer, consider the likelihood that the vast
majority of Internet users don't use mailing lists for
discussions at all any more -- if they want to have a
discussion, they turn to social media and/or some flavor of real
time chat. 

    Yes, and every social media site I've seen, and
      every web-based collaboration tool, especially those based on
      "real time chat", are horribly dysfunctional and far worse than
      email in almost every respect.

    So exploring the potential to improve standard email's
      applicability as a general-purpose collaboration tool seems like a
      worthwhile investment.
    And arguments of the form "social media and/or chat are what the
      market(s) have chosen" strike me as incredibly myopic.   We can
      either at least look into making email more usable for group
      collaboration, or we can cower in fear of failure and abandon the
      Internet to these dysfunctional toys.

       Organizations and businesses, even non-spammy ones,
who use mailing lists to distribute advertising or notifications
are probably not good candidates for advocating your format
either. 

    Which is probably why I didn't suggest that.

        It's also possible that some mailing lists will develop
reputations of enforcing that subset, and that MUAs will learn
which lists enforce that subset and avoid sending
nonconformant HTML to those lists.

      Possible, yes.  It is also possible that the pony I've
periodically expressed a wish for over the years will be
delivered to my doorstep, together with a plan about how to take
care of it, tomorrow morning. 

    If I were going to wish for something, it wouldn't be a pony.  It might
    be for people to stop trying to obstruct any attempt to bring email
    into the 2020s (or even any investigation of the possibility of
    such) based on handwaving and speculation.

    (This is, IMO, a really bad habit within IETF that is all too
    common... launching DoS attacks on new ideas before they're even
    sketched out yet.  Preemptive strikes.   We're not supposed to be
    fighting a war.)

          But what you are proposing is not requiring plain text, but
some artificial HTML subset/profile.

        Well, sure.  Every new protocol is "artificial" until the
details are settled on.

        Also, this presumably isn't an arbitrary variant of HTML -
it's presumably a variant that can display and operate
correctly on existing web browsers and MUAs that implement
HTML.   And presumably the set of elements disallowed won't
be arbitrary, but will be those that fail to meet some
agreed-on criteria that are found to be reasonable for email.

      Yes.  But, again, unless that variant is identified in some
special way (and, most likely, even if it is) its being useful
will depend far more on scale than on technical merit.  And the
collection of IETF participants just isn't big enough.

    IMO identifying such variants (whether in the content-type or the
    DTD or whatever) might be nearly pointless.   The HTML subset chosen
    needs to be one that will display more-or-less consistently in all
    web browsers anyway.   And while I generally do favor tagging of
    precise variants of any data format somehow, I suspect that the
    biggest utility that would derive from such a tag would be that the
    recipient's MUA could see that the recipient is replying to an HTML
    subset message, and take that as a hint that the reply should also
    be formatted in that subset. 

        I certainly don't believe it's impossible to restrict list
traffic to plain text.   If that's what IETF Consensus
wants, I'm okay with it.   I think there are some potential
benefits to allowing some XML-like structure in list messages,
and I lean at least slightly toward that kind of solution.  
But that's not the only way to address the problems IETF is
having with email, and I also recognize that there are
potential benefits to extreme simplicity.

      Ok.  Small disclaimer: I compose email in plain text form, read
it that way when possible, and, when I need to read HTML email,
do so through a very restricted reader/interpreter that will not
automatically follow any links or otherwise process anything
external.  Part of the reason for that is perhaps that I'm some
flavor of old person who is still stuck in the 1970s (not even
the 1990s).  But more of it is a security issue: just as I will
not open a web page for which I don't have a reasonable degree
of knowledge of the party and associated authentication, I am
reluctant to open an HTML email message without at least the
same degree of knowledge and certainty.    Now, I know the two
of you well enough that, if I received an HTML message from you
whose content was signed with a signature I could verify (not
just some assurance that mail from you might reasonably come
from a particular domain), I would have no hesitation about
opening it.  But for IETF participants I don't know nearly as
well, or a message whose signature cannot be validated (possibly
because some list expansion software altered it to eliminate
undesirable HTML elements), I get really anxious about what such
a message might carry or do to me.

There is also a privacy issue.  I'm on several distribution
lists from whom I periodically get a "we've noticed you are not
reading our mail, is there a problem?" message.  I seem to have
the odd idea that they have no right to know whether I have
opened their messages and when.

I have no idea whether any of the reasons the Linux community
has stuck to plain text are related to those security issues,
but I would not be surprised.

    Well I'm in agreement with the privacy and security concerns.   I
    nearly always use MUAs that don't automatically load content from
    email messages, and I'm careful about clicking on links in email
    messages.   If we were to define an HTML subset for use in email
    collaboration, I expect that I'd want to exclude <IMG>
    elements from that subset (except perhaps those using data: URLs and
    *maybe* from explicitly trusted sites), <OBJECT> and other
    elements that can also reference external content, all JavaScript,
    probably all CSS or at least all external CSS, etc.  Indeed part of
    the point of that subset, at least as I imagine it, would be to
    insulate users from harmful content and email-borne spyware.
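
    As a rough sketch of the kind of check described above, the
    following Python scans a message body for elements that could pull
    in external content or run code.  It uses only the standard
    html.parser module.  The names ExternalContentScanner and scan are
    hypothetical, and the ALWAYS_EXCLUDED set is an assumption: the
    message names only <IMG>, <OBJECT>, JavaScript and CSS, so entries
    such as "iframe", "embed" and "link" are illustrative guesses at
    "other elements that can also reference external content".  The
    "explicitly trusted sites" exception is not modeled.

```python
from html.parser import HTMLParser

# Assumed exclusion list for illustration; not an agreed subset.
ALWAYS_EXCLUDED = {"object", "script", "style", "link", "iframe", "embed"}


class ExternalContentScanner(HTMLParser):
    """Collects (tag, reason) pairs for markup the subset would exclude."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute names arrive lowercased
        if tag in ALWAYS_EXCLUDED:
            self.findings.append((tag, "excluded element"))
        elif tag == "img":
            # Allow only images embedded in the message via data: URLs.
            src = (attrs.get("src") or "").strip().lower()
            if not src.startswith("data:"):
                self.findings.append((tag, "external image"))
        if "style" in attrs:
            self.findings.append((tag, "inline CSS"))
        for name in attrs:
            if name.startswith("on"):  # onclick, onload, ...
                self.findings.append((tag, "script handler"))


def scan(html):
    """Return the list of findings for one HTML message body."""
    scanner = ExternalContentScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.findings
```

    An empty result would mean the message stays inside this assumed
    subset; a list processor could strip or bounce on anything else.
    A remote tracking image, for example, would be reported as an
    "external image", which is the email-borne spyware case.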

I think the following are plausible, while inventing a
(different) special dialect or profile of HTML, enforcing it,
and expecting enough others to be on board to make it really
practical is, for the reasons given, probably not:

(i) Require text/plain

    Probably technically doable, at least given the experience of the
    Linux community, though I'm not entirely sure this would fly in
    IETF.

(ii) Reexamine text/richtext or text/enriched, if necessary
updating RFC 1896.  Unlike a new profile or media type, one or
both of these is quietly supported in many systems.  It is
probably more plausible to believe that the implementers of
those systems would do a minor upgrade to keep a media type they
already support aligned with an updated standard than to believe
they would adopt a completely new format (or profile).  Allow/
encourage it instead of or as an alternative to text/plain.

    As I stated above, IMO this is a non-starter.

      (iii) Instead of fantasies about alternate formats (or profiles)
and getting them widely supported, see if we can write down, and
get agreement on, good practices for use of email on IETF lists.

    This is not mutually exclusive with the above, and probably a good
    idea anyway.

      Such practices might even include something along the lines of
"if you are going to use HTML email, you should confine yourself
to the following subset".  

    LOL.  I suspect the number of participants willing to write or edit
    the HTML in their messages on a regular basis is within epsilon of
    zero.   (I say within epsilon only because, knowing IETF, at least
    one person is likely to do it just to prove something.)

       If agreement can be reached, assume
IETF participants are adults who can and will behave responsibly,
with those who don't risking having their messages ignored by
some of the community.

    Dubious assumption, IMO.

    Keith