Re: [Tools-discuss] messaging formatting follies, was The IETF's email

Keith Moore <moore@xxxxxxxxxxxxxxxxxxxx> · Fri, 25 Aug 2023 16:33:22 -0400

On 8/25/23 16:18, John R. Levine wrote:
On Fri, 25 Aug 2023, Keith Moore wrote:
whatever clever anti-spam scheme someone might suggest, I'm 
confident I can tell you where it's failed before.)

It's simple to take a stab at describing the problem, much more 
difficult to identify a specific set of mechanisms that are usable by 
ordinary people to let them control what kind of spam filters make 
sense for them.

"Let people set their own filters" is indeed one of the approaches 
that has failed many, many times before.  When people ask for it, what 
that really means is that the filters on the system they use aren't 
very good. Improve the filtering and the demands for knobs and dials 
go away.

Wrong.  The need for knobs will always be there because different kinds 
of users require different filtering.   Now, poorly chosen knobs will 
definitely fail, too many knobs that users don't understand will 
definitely fail, knobs that interact in strange ways will definitely 
fail.  SIEVE is not for ordinary users.  etc.     But I think most users 
can correctly set some simple well-chosen policies.   Users routinely do 
similar things on Facebook for example, and appear to get the result 
that they want, or at least close to it, a significant amount of the time.

("X has failed many times before" is an anti-pattern.  Such statements 
are only meaningful when X, the conditions under which X was tried, and 
"failure" are all precisely defined.)

The closest you get these days is adjusting filter weights when people 
move stuff between the inbox and spam folder but even that doesn't 
really work.  I get tons of spam reports from users of large mail 
systems about stuff that is clearly not spam, and when I ask they 
usually deny having complained about it.

I've never thought that spam filter weights were worth anything.  In 
general, arithmetic expressions do not make for effective classification 
of mail.

I absolutely agree that there's no purely technical way to 
distinguish spam from ham, but it is possible to fairly reliably 
place messages into specific categories that are useful for some 
particular user.   e.g. based on message content (what language?), 
length, types of attachments, whether signed by a party known to the 
recipient, whether the sender is known to the recipient, etc.
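
A minimal sketch of the kind of per-message categorization described 
above, using only Python's standard `email` package.  The function name 
`describe`, the `known_senders` set, and the feature names are all 
hypothetical, chosen for illustration.  Note that it only checks whether 
a DKIM-Signature header is present; actually verifying a signature, or 
detecting the message's language, would need more than this.

```python
from email import message_from_string
from email.message import Message
from email.utils import parseaddr


def describe(msg: Message, known_senders: set[str]) -> dict:
    """Collect simple features a recipient-side policy could act on.

    Hypothetical helper: the feature names are illustrative only.
    """
    # Bare address from the From: header, lowercased for comparison.
    sender = parseaddr(msg.get("From", ""))[1].lower()

    # MIME types of every part explicitly marked as an attachment.
    attachment_types = sorted(
        part.get_content_type()
        for part in msg.walk()
        if part.get_content_disposition() == "attachment"
    )

    # Total decoded size, in bytes, of all text/* parts.
    text_length = sum(
        len(part.get_payload(decode=True) or b"")
        for part in msg.walk()
        if part.get_content_maintype() == "text"
    )

    return {
        "sender_known": sender in known_senders,
        # Presence only; this does NOT verify the signature.
        "has_signature_header": "DKIM-Signature" in msg,
        "attachment_types": attachment_types,
        "text_length": text_length,
    }
```

Given a parsed message, `describe(msg, {"alice@example.org"})` returns a 
plain dict of features.  None of these features says "spam" or "ham"; 
they only place the message into categories that a particular recipient 
might care about.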

Sorry, but no.  Spammers are not dumb, if you say these are the rules 
to get your mail delivered, they will make their mail look like that.  
I realize this sounds nihilistic but I spend a lot of time with people 
who do this for a living.

I realize that spammers are clever.   I don't think there is or should 
be a set of rules to reliably get mail delivered in every case.   I do 
think it's possible to make specific kinds of classification such as I 
mentioned above, more reliable.

Of course, a person who keeps looking for reasons to kill any kind of 
innovation will always find some.  Problems aren't solved by people who 
insist that there's no possible solution.

Also a set of recommendations for how to make email easily classified 
on the recipient's end.   e.g. which magic DNS records should be set 
up, what kind of signatures should be used, how should those signers' 
public keys be verified?

There are plenty of those.  M3AAWG publishes some of them, no need for 
the IETF to try and duplicate it.

Overall, I still maintain it's a black art.   And saying that IETF can't 
be part of the solution is effectively a DoS attack. Unless of course 
IETF wants to delegate maintenance of email protocols to an organization 
that will responsibly maintain them.

Once I lost a contract worth hundreds of thousands of dollars because 
the would-be client's mail got caught by a spam filter, I realized 
that spam filtering really needed to be something that could be 
specified on a per-recipient basis.

I've lost stuff too, but I can assure you that hand tweaking the 
filters would not guarantee getting the mail you want.  In the extreme 
case one might say deliver everything, in which case you won't even be 
able to see the mail you want among all the crud.

Hand-tweaking the existing filters absolutely won't do that, otherwise I 
would have done that years ago at least for my own mail.   But right now 
users have little choice (if they have any at all) about which mail to 
receive between "give me all the mail sent to me" [*] and "only give me 
what YOU think I should receive".  Neither of which is really a good 
choice for any serious user.

[*] Actually they probably don't even get this particular choice, 
because a lot of SMTP traffic is dropped based on source IP address 
alone, without anything on the recipient's side ever looking at the content.
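
To make the middle ground between those two choices concrete, here is a 
rough sketch of a per-recipient policy with two simple knobs.  Everything 
in it is an assumption made for illustration: `RecipientPolicy`, `route`, 
the folder names, and the shape of the `features` dict do not come from 
any existing mail system.

```python
from dataclasses import dataclass, field


@dataclass
class RecipientPolicy:
    # Hypothetical knobs; not taken from any real mail system.
    always_accept_known_senders: bool = True
    blocked_attachment_types: set[str] = field(default_factory=set)


def route(features: dict, policy: RecipientPolicy,
          provider_says_spam: bool) -> str:
    """Pick a folder for one message under this recipient's policy."""
    # Knob 1: the recipient's own contacts override the provider's verdict.
    if policy.always_accept_known_senders and features["sender_known"]:
        return "inbox"
    # Knob 2: attachment types this recipient never wants delivered.
    if policy.blocked_attachment_types & set(features["attachment_types"]):
        return "quarantine"
    # Otherwise fall back to whatever the provider's filter decided.
    return "spam" if provider_says_spam else "inbox"
```

With the defaults, a message from a known sender lands in the inbox even 
when the provider's filter flags it; a recipient who wants neither 
override gets exactly the provider's verdict.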

Keith