On 8/25/23 16:18, John R. Levine wrote:
On Fri, 25 Aug 2023, Keith Moore wrote:
whatever clever anti-spam scheme someone might suggest, I'm
confident I can tell you where it's failed before.)
It's simple to take a stab at describing the problem, much more
difficult to identify a specific set of mechanisms that are usable by
ordinary people to let them control what kind of spam filters make
sense for them.
"Let people set their own filters" is indeed one of the approaches
that has failed many, many times before. When people ask for it, what
that really means is that the filters on the system they use aren't
very good. Improve the filtering and the demands for knobs and dials
go away.
Wrong. The need for knobs will always be there because different kinds
of users require different filtering. Now, poorly chosen knobs will
definitely fail, too many knobs that users don't understand will
definitely fail, knobs that interact in strange ways will definitely
fail. SIEVE is not for ordinary users, either. But I think most users
can correctly set some simple well-chosen policies. Users routinely do
similar things on Facebook for example, and appear to get the result
that they want, or at least close to it, a significant amount of the time.
("X has failed many times before" is an anti-pattern. Such statements
are only meaningful when X, the conditions under which X was tried, and
"failure" are all precisely defined.)
The closest you get these days is adjusting filter weights when people
move stuff between the inbox and spam folder but even that doesn't
really work. I get tons of spam reports from users of large mail
systems about stuff that is clearly not spam, and when I ask they
usually deny having complained about it.
I've never thought that spam filter weights were worth anything. In
general, arithmetic expressions do not make for effective classification
of mail.
I absolutely agree that there's no purely technical way to
distinguish spam from ham, but it is possible to fairly reliably
place messages into specific categories that are useful for some
particular user. e.g. based on message content (what language?),
length, types of attachments, whether signed by a party known to the
recipient, whether the sender is known to the recipient, etc.
Sorry, but no. Spammers are not dumb: if you say these are the rules
to get your mail delivered, they will make their mail look like that.
I realize this sounds nihilistic but I spend a lot of time with people
who do this for a living.
I realize that spammers are clever. I don't think there is or should
be a set of rules to reliably get mail delivered in every case. I do
think it's possible to make specific kinds of classification, such as
those I mentioned above, more reliable.
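To make "specific kinds of classification" a bit more concrete, here is a
minimal sketch in Python using only the standard library's email package.
It is an illustration, not a proposal: the function name categorize, the
known_senders address book, and the feature names are all hypothetical,
and checking for a DKIM-Signature header only shows that a signature is
present, not that it verifies. Language detection is left out because it
would need more than the standard library.

```python
from email import message_from_string
from email.utils import parseaddr


def categorize(raw_message, known_senders):
    """Extract a few per-message features a recipient might filter on.

    known_senders is a hypothetical address book: a set of lowercase
    addresses the recipient already corresponds with.
    """
    msg = message_from_string(raw_message)

    # Address from the From: header (trivially forgeable on its own).
    _, sender = parseaddr(msg.get("From", ""))

    attachment_types = set()
    body_length = 0
    for part in msg.walk():
        if part.get_content_disposition() == "attachment":
            attachment_types.add(part.get_content_type())
        elif part.get_content_maintype() == "text":
            # Count the decoded bytes of inline text parts only.
            payload = part.get_payload(decode=True) or b""
            body_length += len(payload)

    return {
        "sender_known": sender.lower() in known_senders,
        # Presence only; a real system would verify the signature.
        "has_signature_header": msg.get("DKIM-Signature") is not None,
        "attachment_types": sorted(attachment_types),
        "body_length": body_length,
    }
```

None of these features says "spam" or "ham" by itself. The point is only
that each one can be computed mechanically and then handed to whatever
policy the recipient has chosen.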
Of course, a person who keeps looking for reasons to kill any kind of
innovation will always find some. Problems aren't solved by people who
insist that there's no possible solution.
Also a set of recommendations for how to make email easily classified
on the recipient's end. e.g. which magic DNS records should be set
up, what kind of signatures should be used, how should those signers'
public keys be verified?
There are plenty of those. M3AAWG publishes some of them; there's no
need for the IETF to try to duplicate them.
Overall, I still maintain it's a black art. And saying that IETF can't
be part of the solution is effectively a DoS attack. Unless of course
IETF wants to delegate maintenance of email protocols to an organization
that will responsibly maintain them.
Once I lost a contract worth hundreds of thousands of dollars because
the would-be client's mail got caught by a spam filter, I realized
that spam filtering really needed to be something that could be
specified on a per-recipient basis.
I've lost stuff too, but I can assure you that hand tweaking the
filters would not guarantee getting the mail you want. In the extreme
case one might say deliver everything, in which case you won't even be
able to see the mail you want among all the crud.
Hand-tweaking the existing filters absolutely won't do that; otherwise I
would have done it years ago, at least for my own mail. But right now
users have little choice (if they have any at all) about which mail to
receive: it's either "give me all the mail sent to me" [*] or "only give
me what YOU think I should receive". Neither of which is really a good
choice for any serious user.
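As a rough illustration of what a middle ground between those two
extremes could look like, here is a minimal Python sketch of a
per-recipient policy with two simple knobs. Everything in it is an
assumption made up for the example: the RecipientPolicy class, the knob
names, and the "inbox"/"quarantine" outcomes do not correspond to any
existing system or standard.

```python
from dataclasses import dataclass


@dataclass
class RecipientPolicy:
    # Hypothetical knobs; a real design would need to pick these with care.
    quarantine_unknown_senders: bool = True
    blocked_attachment_types: frozenset = frozenset()


def decide(policy, sender_known, attachment_types):
    """Return "inbox" or "quarantine" for one message under one policy."""
    # A blocked attachment type is quarantined regardless of sender.
    if any(t in policy.blocked_attachment_types for t in attachment_types):
        return "quarantine"
    # Mail from strangers is held only if this recipient asked for that.
    if not sender_known and policy.quarantine_unknown_senders:
        return "quarantine"
    return "inbox"


# Someone who expects mail from would-be clients they have never met
# could turn the first knob off; someone else might leave it on.
consultant = RecipientPolicy(quarantine_unknown_senders=False)
cautious = RecipientPolicy(
    blocked_attachment_types=frozenset({"application/x-msdownload"})
)
```

The same message from an unknown sender would reach the consultant's
inbox and be held for the cautious user, which is the whole point of
making the decision per recipient.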
[*] Actually, they probably don't even get this particular choice,
because a lot of SMTP traffic is dropped based on source IP address
alone, without anything on the recipient's side ever looking at the content.
Keith