On Sun, Mar 30, 2008 at 04:40:53PM +1300, Sam Vilain wrote:

> > My point is that we don't _know_ what is happening in between the decode
> > and encode. Does that intermediate form have the information required to
> > convert back to the exact same bytes as the original form?
> No, it doesn't. If you want that, save a copy of the string (it's a
> lazy copy anyway).

We do already save a copy. The issue is that Robin is proposing
decode/encode to check for validity. It was not clear to me that such a
process would always return the exact same bytes even for valid utf-8.

But it seems like you are saying below that it is really just the
"decode" part of that which is interesting:

> utf8::decode works in-place; it is essentially checking that the string
> is valid, and if so, marking it as UTF8.
>
>   my ($encoding);
>   if (utf8::decode($string)) {
>       if (utf8::is_utf8($string)) {
>           $encoding = "UTF-8";
>       }
>       else {
>           $encoding = "US-ASCII";
>       }
>   }
>   else {
>       $encoding = "ISO8859-1";
>   }

OK, that was the magic invocation we were looking for. Thank you.

> For US-ASCII, you'll only have to encode if the string contains special
> characters (those below \037) or any "=" characters.

Ah, yeah. I think our tests are lacking in that they check for only
[^[:ascii:]].

> Anyway, I guess all this rubbish is why people use CPAN modules, so that
> they don't have to continually rediscover every single protocol quirk
> and reinvent the wheel.
>
> ie, it would be much, much simpler to use MIME::Entity->build for all of
> this, and remove the duplication of code.

Yes, I actually made a similar comment recently. send-email could
probably be shorter, easier to read, and have fewer bugs if it used one
of the many mail-handling CPAN modules. I think it would pretty much
involve scrapping the current send-email and starting fresh, though.

Thanks for your input.

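For reference, here is a minimal sketch of how the two checks discussed
above might fit together. The function names guess_encoding and
needs_header_encoding are hypothetical (they are not in send-email), and
the "needs encoding" rule simply follows the description quoted above
(control characters, "=", or non-ASCII bytes) rather than a full reading
of RFC 2047:

```perl
use strict;
use warnings;

# Hypothetical helper: guess a charset label for a raw byte string.
# utf8::decode works in place, so operate on a copy and leave the
# caller's bytes untouched.
sub guess_encoding {
    my ($bytes) = @_;
    my $copy = $bytes;
    if (utf8::decode($copy)) {
        # Valid UTF-8. The UTF8 flag is only turned on when the input
        # contained multi-byte sequences, so plain ASCII stays unflagged.
        return utf8::is_utf8($copy) ? "UTF-8" : "US-ASCII";
    }
    # Not valid UTF-8: fall back to Latin-1 (an assumption, as above).
    return "ISO-8859-1";
}

# Hypothetical helper: does this header text need encoding at all?
# True for non-ASCII bytes, control characters, or any "=".
sub needs_header_encoding {
    my ($s) = @_;
    return $s =~ /[^[:ascii:]]|[\x00-\x1f]|=/ ? 1 : 0;
}

for my $subject ("plain subject", "a=b", "caf\xc3\xa9", "caf\xe9") {
    printf "%-12s needs_encoding=%d\n",
        guess_encoding($subject), needs_header_encoding($subject);
}
```

Working on a copy also sidesteps the round-trip question entirely: the
original bytes are never re-encoded, only inspected.
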
-Peff