Re: Adding Mime Types

Shawn Willden <shawn-kde@xxxxxxxxxxx> · Mon, 22 Mar 2004 13:18:09 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 14 March 2004 12:27 pm, James Henry Maiewski wrote:
> Hello,
>
> 	 I'm sorry if this is a bit off the topic at hand.
>
> 	I've been getting a bunch of spam of late which is simply a bunch of
> random words.  If I view the source, however, there is also some html code
> in the message.  The different sections are separated, in the source (plain
> text) message by something like:
>
> --=_NextPart_000_000B_75K42NA5_4604F5156
> Content-Type: text/html; charset=us-ascii
> Content-Transfer-Encoding: 8bit
>
> with this in the header:
>
> Content-Type: multipart/alternative;
>   boundary="=_NextPart_000_000B_75K42NA5_4604F5156"
>
> What's up with this?

The messages are MIME multipart messages, which have multiple sections 
separated by a separator string chosen to ensure that the message content 
doesn't contain the separator.  The sections can each have a different 
content type and encoding, so you can attach images, for example.  A mail 
reader doesn't present these sections directly; instead, it should determine 
what's in each section (based on content type), decode it appropriately 
(based on encoding, which is often Base64 for binary data, like images) and 
present the pieces to the user in an appropriate fashion.
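
To make that concrete, here is a minimal sketch using Python's standard 
email package (not anything KMail itself does).  The sample message, its 
"sep42" boundary and the list_parts() helper are all made up for 
illustration.  It walks the parts and decodes each one according to its 
own headers:

```python
from email import message_from_string
from email.policy import default

# Hypothetical two-part message; the HTML part is Base64-encoded.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="sep42"

--sep42
Content-Type: text/plain; charset=us-ascii

random words here
--sep42
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: base64

PGI+QnV5IG5vdyE8L2I+
--sep42--
"""

def list_parts(raw):
    """Return (content type, decoded text) for every leaf part."""
    msg = message_from_string(raw, policy=default)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # the container itself carries no displayable content
        # get_content() undoes the transfer encoding and the charset
        parts.append((part.get_content_type(), part.get_content().strip()))
    return parts

for ctype, text in list_parts(RAW):
    print(ctype, "->", text)
```

The text/plain part comes back as the random words, while the text/html 
part only becomes readable once the Base64 layer is removed.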

In this particular case, spammers are employing a common feature of many mail 
readers for their own nefarious purposes.  In the early days of HTML e-mail, 
many mail readers couldn't handle HTML at all, so HTML-capable mail programs 
took to sending two copies of each message, one in plain text and one in HTML, 
in separate MIME parts.  An HTML-capable reader that received such a message 
could notice the HTML part and just display that, so that the user would get 
the full "benefit" of the sender's bright pink text on a purple background 
with liberal use of the <blink> tag.  On the other hand, a reader that didn't 
know anything about HTML or multipart messages would simply display the 
message "source" which contains the plain text at the top.  Slightly smarter 
non-HTML readers would recognize the multipart message and pick out the 
text/plain part to display and then offer the user various options about what 
to do with the other part(s).
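
As a rough illustration of the two-copies scheme, again with Python's 
standard email package: build_alternative() and pick_body() are 
hypothetical helper names, and the preference flag only stands in for 
whatever a real reader's settings would be.

```python
from email.message import EmailMessage

def build_alternative(plain, html):
    """Build a message carrying the same content as plain text and HTML."""
    msg = EmailMessage()
    msg["Subject"] = "example"
    msg.set_content(plain)                     # becomes the text/plain part
    msg.add_alternative(html, subtype="html")  # turns msg into multipart/alternative
    return msg

def pick_body(msg, prefer_html):
    """Pick the part a reader would show, given its HTML preference."""
    order = ("html", "plain") if prefer_html else ("plain", "html")
    body = msg.get_body(preferencelist=order)
    return body.get_content_type(), body.get_content()

msg = build_alternative("Hello in plain text.", "<p>Hello in <b>HTML</b>.</p>")
print(pick_body(msg, prefer_html=False)[0])
print(pick_body(msg, prefer_html=True)[0])
```

Nothing forces the two parts to say the same thing, which is exactly the 
loophole described next.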

Spammers are using this to attempt to defeat spam filters.  They put a bunch 
of plain text words in a text/plain section so that filters will read this 
"non-spammy" stuff and conclude that the message isn't spam.  They then put 
the real content in a text/html section, knowing that Outlook Express will 
simply display it.

KMail, of course, is an HTML-capable mail reader, but by default prefers to 
display plain text rather than HTML for security reasons (and to save the 
user from blindness induced by exposure to blinking pink-on-purple text).  
You can change this default globally or on a per-folder basis (globally with 
Settings->Configure KMail->Security->General and per folder with the Folder 
menu).

> Does KMail obscure the html source like it obscures 
> the headers?

For a message that contains both text/plain and text/html parts, it will 
choose one or the other based on your settings and display that one, 
rendering the HTML if that's what you prefer.  As I said, by default it 
prefers text/plain.

For a message that contains only text/html, KMail will display the raw HTML if 
you prefer text/plain, with a link to render the HTML.  If your preferences 
say to prefer HTML, it will just render the HTML.
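
The rules in the last two paragraphs can be written down as a small 
decision function.  This is only a sketch of the behaviour as described 
here, not KMail's actual code; choose_display() and its return strings 
are invented for the example.

```python
def choose_display(part_types, prefer_html):
    """Decide how to show a message, given the content types it contains."""
    has_plain = "text/plain" in part_types
    has_html = "text/html" in part_types
    # HTML is used when the user prefers it, or when there is nothing else.
    if has_html and (prefer_html or not has_plain):
        if prefer_html:
            return "rendered-html"
        return "raw-html-with-render-link"
    if has_plain:
        return "plain"
    return "other"

# Both parts present, default preference (plain text):
print(choose_display({"text/plain", "text/html"}, prefer_html=False))
# HTML only, default preference:
print(choose_display({"text/html"}, prefer_html=False))
```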

Also, note that KMail "obscures" the headers by default but you can change 
that if you like.  Try setting "View->Headers->All Headers", or one of the 
other options.

> What else might it preprocess out? 

It will also process other parts that contain images or other data files and 
display them appropriately, rather than showing you the raw data.  
"Appropriately" again depends on your configuration as well as on the 
Content-Disposition specified in the MIME part.  If you have it set to view 
attachments "inline", it will display images, for example, graphically in the 
message body.  If you have it set to "icons", it will just give you an 
appropriate icon for the MIME type.
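
Here is a small sketch of how a reader might combine a view setting with 
what each part says about itself, using Python's standard email package.  
The plan_display() helper, the "inline"/"icons" setting values and the 
rule itself are assumptions made for illustration; KMail's real logic 
may well weigh the Content-Disposition differently.

```python
from email.message import EmailMessage

def plan_display(msg, view="inline"):
    """Return (filename, disposition, display choice) for each attachment."""
    plan = []
    for part in msg.iter_attachments():
        ctype = part.get_content_type()
        # Content-Disposition is 'inline', 'attachment', or absent.
        disposition = part.get_content_disposition() or "unspecified"
        if view == "inline" and part.get_content_maintype() == "image":
            how = "show image in body"
        else:
            how = "show icon for " + ctype
        plan.append((part.get_filename(), disposition, how))
    return plan

msg = EmailMessage()
msg.set_content("see attached")
# Placeholder bytes, not a real image.
msg.add_attachment(b"\x89PNG", maintype="image", subtype="png",
                   filename="cat.png")
print(plan_display(msg, view="inline"))
print(plan_display(msg, view="icons"))
```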

Also, if the message contains PGP or S/MIME signature blocks, or has parts 
that are encrypted, KMail will attempt to do the right thing with those as 
well, displaying different colored borders to indicate the validity of the 
signature as well as the trust level of the signing key, rather than showing 
the (useless) raw data.  For example, you probably see this message with a 
yellow border, indicating that the signature is valid but that you don't 
trust it.  If your gpg was configured to trust my key, the border would be 
green.

This is a simplification; I haven't looked at the code but I'm sure the 
decision tree is much more complex.  The determination of how to display an 
e-mail message depends on how the message is structured, what message and 
message part headers say, what user preferences are set and what KMail thinks 
is safe.

	Shawn.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAX0oBp1Ep1JptinARAoAnAJ4zK25oOo9mwAjKkxyN9fGo7zWfeQCeISBk
p1rzdJ25KyXdHpSBeiLxFXk=
=aGlR
-----END PGP SIGNATURE-----
___________________________________________________
.
Account management:  https://mail.kde.org/mailman/listinfo/kde.
Archives: http://lists.kde.org/.
More info: http://www.kde.org/faq.html.