
Multiple HEAD elements breaking messages?



It looks like Yahoo! Groups has started sending out messages containing
multiple HEAD elements in the text/html MIME subpart of its emails.

Unfortunately, this seems to break MHonArc, probably because it only
expects one and uses it to determine X-Body-of-Message et al.

I've been reading through the W3C DTD for HTML 4.01 and it's not clear
to me whether HTML documents are allowed one HEAD element or more than one:

   http://www.w3.org/TR/html401/struct/global.html#edef-HEAD

If more than one is allowed, it's a bug in MHonArc.  But if it's Yahoo!'s
fault, MHonArc is in the clear and will just work around it.  ;)

Does anyone know for sure how this should be handled?  I have a patch
included below to correct this, but I'm curious if multiple HEAD tags
are legal or not.

Thanks,

Chris

P.S.  You can see the (icky) message at 
      http://www.mallorn.com/~lindsey/multi-head-ick.txt

http://www.bonvivantnursery.com/                     Bon Vivant Nursery
http://www.hort.net/gallery/      4023 online plant photos and growing!
http://www.hort.net/gallery/date/2006-07-26/       The latest additions

diff -rc MHonArc-2.6.16/lib/mhtxthtml.pl MHonArc-2.6.16-headfix/lib/mhtxthtml.pl
*** MHonArc-2.6.16/lib/mhtxthtml.pl     Sun May  1 19:04:39 2005
--- MHonArc-2.6.16-headfix/lib/mhtxthtml.pl     Tue Nov 14 18:02:20 2006
***************
*** 186,194 ****
      $base =~ s|(.*/).*|$1|;
  
      ## Strip out certain elements/tags to support proper inclusion:
!     ## some browsers are forgiving about dublicating header tags, but
      ## we try to do things right.  It also help minimize XSS exploits.
!     $$data =~ s|<head\s*>[\s\S]*</head\s*>||io;
      1 while ($$data =~ s|<!doctype\s[^>]*>||gio);
      1 while ($$data =~ s|</?html\b[^>]*>||gio);
      1 while ($$data =~ s|</?x-html\b[^>]*>||gio);
--- 186,194 ----
      $base =~ s|(.*/).*|$1|;
  
      ## Strip out certain elements/tags to support proper inclusion:
!     ## some browsers are forgiving about duplicating header tags, but
      ## we try to do things right.  It also help minimize XSS exploits.
!     $$data =~ s|<head\s*>[\s\S]*?</head\s*>||gio;
      1 while ($$data =~ s|<!doctype\s[^>]*>||gio);
      1 while ($$data =~ s|</?html\b[^>]*>||gio);
      1 while ($$data =~ s|</?x-html\b[^>]*>||gio);
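
To show what the changed substitution does, here is a minimal standalone
sketch. It is not part of MHonArc: the sample HTML and the variable names
($html, $greedy, $lazy) are made up for illustration, and only the two
patterns are taken from the diff above. With two HEAD elements, the original
greedy pattern matches from the first <head> to the last </head> and takes
the text in between with it, while the non-greedy pattern with /g removes
each HEAD on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input: two HEAD elements with body text between them.
my $html = join '',
    '<html><head><title>one</title></head>',
    '<body><p>first part</p>',
    '<head><style>p { color: red }</style></head>',
    '<p>second part</p></body></html>';

# Original pattern: greedy, no /g.  The match runs from the first <head>
# to the LAST </head>, so the text between the two HEADs is removed too.
(my $greedy = $html) =~ s|<head\s*>[\s\S]*</head\s*>||i;

# Patched pattern: non-greedy with /g.  Each HEAD element is removed
# separately and the text between them is kept.
(my $lazy = $html) =~ s|<head\s*>[\s\S]*?</head\s*>||gi;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

On this input the greedy version loses the "first part" paragraph, which
seems consistent with the breakage described above, though I have not
confirmed that this is the exact failure on the real Yahoo! message.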

