On Fri, 29 Sep 2006 11:41:34 -0500 (CDT), "Richard Lynch" wrote:

> Consider that the user could provide *ANY* string, of any size, of any
> composition, for their "Subject"
>
> Maybe they POST a worm in Subject, and it has no newlines, but still
> manages to propogate through Outlook.
>
> Or maybe it's just a nice subject in Japanese.
>
> I know nada about Unicode, uuencode, and all that crap.
>
> Or, maybe, it's not even a VALID subject for SMTP, for whatever the
> arcana rules of SMTP-ness are.
>
> My contention is that the lowly application developer (me) should not
> need a degree in i18n nor SMTP just to pass on a valid SMTP subject in
> an email.

I've been meaning to look into this, so I might as well do it now.

The obvious assumption would be that the mail() function would:

a) escape all its arguments to make them suitable for use in an email

This should, at a minimum, do for newlines what addslashes() does for
quotes: neutralize CR and LF so that an argument cannot start a new
header line. Ideally, it should also encode the arguments as
quoted-printable if necessary, but the mail() function would then need
to know what character encoding the strings are in.

So, the burden of escaping appears to be on the user, rather than on
the mail() function. What then is it that has to be done?

The 'Subject:' header is fairly simple. Its content is '*text' in
RFC 822 terms, where problematic 'text' parts can be replaced by
'encoded-word' in RFC 2047 terms. In other words, it can be encoded
directly, using either the quoted-printable-like 'Q' encoding or the
base64 'B' encoding.

The 'To:' header is more problematic. Is it an (RFC 822) 'address',
'mailbox', 'addr-spec' or something else? The escape mechanism would be
different in each case (I think), and the PHP documentation doesn't
provide any information on this.

b) escape all parameters sent to the sendmail program through the shell.

It would be interesting to run a sendmail dummy that dumps its arguments
to a file to see what's really going on. I don't seem to be able to test
this on my old Windows clunker at home.

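To make point a) a bit more concrete, here is a rough sketch of what
the caller could do for the 'Subject:' header. It is only an
illustration under stated assumptions, not what mail() does internally:
the function names strip_header_breaks() and encode_subject() are made
up for this example, the input is assumed to be UTF-8, and the code
assumes PHP 7 or later. It uses the 'B' (base64) form of RFC 2047
encoded-words and keeps each chunk small enough that a folded line
stays within the 76-character limit RFC 2047 sets for such lines.

```php
<?php
// Hypothetical helpers for illustration; these names are not part of PHP.

// Remove CR, LF and NUL so a value cannot start a new header line.
function strip_header_breaks(string $value): string
{
    return str_replace(["\r", "\n", "\0"], '', $value);
}

// Encode a subject as RFC 2047 'B' encoded-words.
// Assumption: $subject is UTF-8. The caller has to know the encoding,
// because a plain PHP string does not carry that information.
function encode_subject(string $subject): string
{
    $subject = strip_header_breaks($subject);

    // Printable ASCII that cannot be mistaken for an encoded-word
    // is passed through unchanged.
    if (preg_match('/^[\x20-\x7E]*$/', $subject)
        && strpos($subject, '=?') === false) {
        return $subject;
    }

    // Split into whole UTF-8 characters so that no character is cut
    // in half between two encoded-words.
    $chars = preg_split('//u', $subject, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Subject is not valid UTF-8');
    }

    // 39 raw bytes give 52 base64 characters, i.e. a 64-character
    // encoded-word, which leaves room for "Subject: " on the first line.
    $words = [];
    $chunk = '';
    foreach ($chars as $char) {
        if ($chunk !== '' && strlen($chunk) + strlen($char) > 39) {
            $words[] = '=?UTF-8?B?' . base64_encode($chunk) . '?=';
            $chunk = '';
        }
        $chunk .= $char;
    }
    $words[] = '=?UTF-8?B?' . base64_encode($chunk) . '?=';

    // Adjacent encoded-words are joined by a folded line break.
    return implode("\r\n ", $words);
}

// Usage: the injected line break is removed before anything else happens.
echo encode_subject("Hi\r\nBcc: someone@example.com"), "\n";
```

Whether mail() accepts the folded result unchanged on every platform is
something I have not checked. The 'To:' header needs different handling,
since only the display-name part of an address may be encoded this way.
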
> For *any* data that PHP has to pass back and forth in its "glue" there
> are potentials for the kind of problems we've seen with spam, site
> defacing, viruses, etc.
>
> What I'm suggesting is that in addition to mysql_escape[_real]_string,
> maybe there needs to be more "escape" string functions.
>
> So with all these potential issues, I'm wondering if there isn't a
> more systemic approach to this.

Wouldn't it be nice if strings were associated with a (hidden) character
encoding field, so that mail() and other functions could just "do their
thing" and not bother us users with the detaily bits?

Wait a minute... Isn't that what multi-byte strings are for? The
mb_send_mail() function looks like a strong contender here. And no, I
haven't tried any funky mb_ stuff so far...

> Plus, for the functions that we *DO* have, a grid of "from" and "to"
> and the appropriate converter function seems like it would be a Good
> Idea.
>
> It's all to easy to find a problem like ' where addslashes seems like
> the "right answer" but, in reality, what I do not know is that ~ is
> also a special character to the [mumble] extension/protocol/whatever
> and I'm using the wrong escape function.
>
> There are 2 reasons why I'm not using the right escape function.
> #1. The right one just plain doesn't exist.
> #2. The docs, wonderful as they are, don't really lay out something as
> fundamental as the right escape function for situation X, because you
> need a degree in CS just to "know" that X is really a Y so the right
> function is Z.

Alas, there are many parts of PHP that are seriously underspecified.

--nfe

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php