On Sun, 7 Mar 2004, Michael Thomas wrote:

> Paul Hoffman / IMC writes:
>  > At 8:19 AM -0800 3/6/04, Michael Thomas wrote:
>  > >So... instead of pointing out the obvious that
>  > >there is no silver bullet, wouldn't it be a lot
>  > >more productive to frame this debate in terms of
>  > >what incremental steps could be taken to at least
>  > >try to change the overall climate?
>  >
>  > Only if such framing includes the costs of the steps. To date, most
>  > of the initial proposals we have seen on this (and many other) lists
>  > have three attributes in common:
>  >
>  > - They don't list the obvious problems
>  >
>  > - They don't even guess at the costs of those problems
>  >
>  > - They don't have an analysis of how hard or easy it will be for
>  >   spammers to adapt to the proposal
>
> Fine. Truth in advertising is wonderful. Then
> what? From what I can tell, anything that falls
> short of perfection then gets summarily
> executed. What metrics do you suggest when the
> answer is less than perfect that doesn't result in
> paralysis? That seems to be the real breakdown
> here.

There is no real breakdown here, and perfection isn't the issue. A proposal doesn't have to be perfect; it has to be realistic and not obviously flawed.

It seems fairly obvious that any serious proposal for anything, let alone a complex problem such as spam abatement, should include a feasibility and cost/benefit analysis. This is SOP throughout business, government, academe, engineering -- why should IETF proposals and discussions be exempted from this? Vernon is pointing out that most of the discussion on this topic on this list in the recent past has omitted these components, and proposes solutions over and over again that have either been proposed in the past but rejected as infeasible or expensive or that have been TRIED in the past, are implemented now, and that are not provING (now, in real time) to be tremendously effective in preventing spam.
In a previous reply his remark about some of the proposals being "innumerate" was dead on the money -- in most cases a very simple analysis of the actual numbers demonstrates that a proposed measure, after being implemented at great expense and inconvenience, will only affect a tiny fraction of the problem (for example) or will not have any effect at all.

There are several things one should accept in any discussion of spam abatement. The first and foremost (one that might well go at the very head of the "principles" statement we were discussing last week) is that there MAY BE NOTHING THAT THE IETF CAN DO at the protocol level to control spam, at least not directly. If you prefer this phrased in a prettier way, it may be that any measures that WOULD result in an abatement of spam are all cures that are worse than the disease, either because of astronomical costs or because they would necessitate removing some desired/fundamental property from email (such as the ability to receive mail from strangers without a complex dance that would be even more annoying and stultifying to electronic communication than spam is).

My memory isn't what it used to be (and it was never very good) but here is a short list of what I have heard proposed recently as ways of abating spam (and in some cases, other forms of network abuse such as viruses as well):

a) Add a "cost" per message. Bill Gates himself came out in public favor of this in the newspaper over the weekend. (A cynical public is invited to wonder why.)

Pros: "Some people estimate" that a cost of as little as $0.01/message would deter spammers. [Who these people are and why their guess is any better than mine remains unsaid.
I personally note that costs of anywhere from a dime to a dollar plus the hassle of having to physically handle paper, envelopes, postage do not seem to have the slightest effect on the direct advertising fraction in my real mailbox on a daily basis, with a persistent noise (advertising) to signal (all other forms of communication combined) ratio that easily exceeds 2:1.] "It is believed" (by these same people) that everyday users won't mind paying the cost in time or money because they don't send much mail.

Cons: I don't want to pay any cost per message. I don't want to solve a puzzle to send mail. I don't want to have to solve a puzzle eight thousand times to send mail via a list. I don't want to have to manage a cost-based apparatus. The freedom of the Internet is far more valuable to me than spam abatement and this is a cure worse than any disease.

Note that I'm just giving MY response to this proposal. I send twenty or thirty pieces of mail a day and there are other cheaper methods of controlling spam. Finally, I strongly suspect that the "people" who are estimating that cost will deter spammers are at least in some cases people who stand to make money hand over fist charging it.

The fundamental premise here seems to be that we are more able and willing to pay a higher cost for mail than spammers, in spite of the fact that they actually MAKE money from email and I just use it for communications with no obvious or direct monetary yield and have to pay for it with real money, not a fraction of the income generated by the activity itself. I can only conclude that this proposal is promoted by individuals who don't actually have to try to get money (from a grant agency, from a corporate budget, from their own pockets) in order to pay for mail-related infrastructure and/or by individuals who hope to profit by collecting some of those added costs.

b) Require all mail to be electronically signed.
In some cases encrypted as well, imagining encryption to be a "cost" that might deter spammers. Signature/encryption schema vary, tools required to enable actual authentication of said signatures left vague.

Pros: A signature permits you to positively identify mail from friends and people you know that are not spammers, IF you have an authority or agent capable of managing the large and widely distributed database of signature keys involved. Encryption (as a separate issue) prevents email from being read in transit, and requires a few thousandths of a second of modern CPU per KB of payload on both ends. It has similar key database/certification requirements.

Cons: I >>already<< can identify mail from friends and people I know with 99.99% accuracy. The "From" header from these individuals is generally completely correct. I cannot recall receiving a single piece of spam, ever, with a header forged so that it appears to come from a friend of mine. I have received a small but nonzero number of viruses that way (small because most of my friends use linux and hence are not susceptible to most current header-forging viruses).

Encryption obviously doesn't add a sufficient cost to deter spammers, and elementary arithmetic indicates that it (or associated code-driven delays) will NEVER add a cost that would deter spammers before it also deterred all sorts of legitimate uses of mail and cost a fortune in wasted resources at Internet scale.

Encrypted mail already is available. People can already sign their mail digitally if they wish (and many do) and manage keys as best they can. Neither measure seems likely to impact spam, because spam is sent by strangers, and strangers are perfectly capable of electronically signing or encrypting their mail to me - I just won't recognize the signature because they are strangers. I'd expect this to have absolutely no impact on spam at all besides making my internal whitelist whiter (a difficult concept in a binary decision).

c) Wait!
Spam comes from strangers, right? So we'll require all mail to come from people you know, or people you "consent" to receive mail from! And naturally, you won't consent to receive any spam!

Pros: Well, hard to argue with this one. If I only consent to receive mail from people I know, or mail from strangers that isn't spam, won't that abate the spam problem? Kind of a tautology, that...

Cons: Yes, and if only the Palestinians and Israelis would lay down their arms, open all their borders, convert to Zen Buddhism and embrace one another in a big love-fest, it would abate the problem they have with killing each other too. This is a prime example of a proposal without any sort of internal reality check or CBA.

First off, I consider the ability to receive mail from strangers an essential feature of email, not a bug, and will oppose (by ignoring in any local implementation) any proposal that would remove that feature. A cure clearly worse than the disease. I assume all of you agree, after all NONE of you know me -- I just up and signed up for the list (self appointed, un-vouched for). I recognize just one name among the posters from other lists I'm on (Joel:-). I could insert a line like "Visit my website. Read my books of poetry there. Send me money." and the most vigilant of you would have just received spam (or DID you just receive spam, heh, heh, hard to tell...;-).

Second, any implementation that LEAVES IN this ability has an associated problem in pure logic. A stranger sends me mail. Do I consent to receive it? If no, then I'm rejecting mail from strangers (cure worse than disease). If yes, then I can be spammed if the stranger happens to be sending me spam (as my stealth spam above clearly proves). End of story. Note that the fact that I don't "consent to receive spam" is irrelevant until somebody invents a psychic 100% efficient spam filter, and if we had one of those we wouldn't be having this discussion.
Note that one of many reasons this is a useless proposal is that email identities are cheap, easily changed, readily available from multiple sources, and in constant churn for legitimate reasons (e.g. changing ISPs, getting a new account, seeking anonymity). If they weren't, we wouldn't NEED the proposal as blacklisting individuals would suffice. Blacklisting me as rgb@xxxxxxxxxxxx is easy because I'm NOT a (real) spammer -- I've had the same email address for well over a decade now. However, if I clicked over to yahoo...I could probably spam the list once a day "forever" in spite of your best efforts to stop me.

d) OK, so we can't control individuals as there are order of a billion of them and millions of them change identities on a given day. What about controlling networks? Only accept mail from "clean" networks.

Pros: "Consent" applied at the network level (via white and blacklists) is already in fairly common use. I personally believe that tightening up the regulation of networks might well "help" abate the spam nuisance. Not a magic bullet, but improving AUPs and enforcement of same and clearly requiring SPs to police their networks in specific ways might, actually, help. This is very much a matter for open debate, however, as one has to show in any proposal how it will both alter and improve what we already achieve here.

Cons: Consent at the network level IS in common use, and it may be that we've already gotten all the benefit it can yield. There is a time lag problem here as well -- blacklists are often trying to "catch up" with the rapidly changing spammer identities. Network identities aren't a lot more expensive or difficult to change than individual identities, and some superlarge domains (e.g. yahoo, hotmail) are effectively impossible to blacklist (much as I'm sure many of us have been tempted to do so:-) because there are too many friendly strangers mixed in with the evil spammers that abuse their services.
There are also complex legal issues to resolve. AUPs tend to be actual contracts and have to be dickered out by lawyers. Enforcement is not cheap, which is why many providers throw up their hands and refuse to deal with the problem or blame somebody else. Some SPs may have a vested interest in NOT controlling the problem, as they profit (indirectly) from spammers working through their domains.

Still, this DOES seem to me at least to be a place where the IETF might make some small contribution, perhaps by working out a clean partitioning of the responsibility that everybody seems to want to avoid and getting it written into future AUPs from the top down, possibly by integrating this process with e).

e) Ah, so the networks are avoiding the responsibility of dealing with the spammers they advertently or inadvertently provide network access to. How about if we write some laws and regulations REQUIRING them to deal with the problem with fines and other penalties for noncompliance? While we're at it, how about if we whack spammers upside the head with all sorts of laws and penalties? THERE'S a way of adding real costs to those that profit from spam.

Pros: In my opinion, this is almost certain to prove the most effective way to abate spam in the long run. One proven model is the national DNC list, which has been near-miraculous in its effectiveness in abating phone spam (even MORE expensive, recall, than paper mail, although the $1-2/message cost didn't deter phone spammers from sending two to ten messages a day to my household, see a) above). This adds a very real "cost" to spamming -- fines and the risk of jail time -- and forces spammers into the same operational zone as virus hackers. It permits the actual pursuit of spammers through networking barriers. It adds costs to SPs that "enable" spam (or fail to police it) as well.

Cons: Spam is as international as the Internet itself, making enforcement much more difficult than it first appears.
Laws are great but have to be enforced, which requires individuals to complain, police to act, DAs to act, juries to act. Action takes time and is a significant cost: legal measures are slow and can be very costly. Finally and perhaps most important, regulatory laws ALSO reduce the freedom of the Internet itself and may well prove to be a cure worse than the disease in the long run. Nibble away on freedom of speech here, and somebody seeking control over public discourse will work out a way of taking a bite out there.

This approach seems to be gradually moving forward of its own accord, driven by considerable public dissatisfaction with spam. Prosecutions are occurring. Little attention has been paid as yet to SP responsibilities and liability, but it may well be that a few successful prosecutions of spammers and/or enabling service providers provides the "spark" that brings about self-regulation by the rest of the SPs if only to keep the nose of the very smelly government regulatory camel out of their tents. "Clean up your act or we'll clean it up for you" may prove to be the deciding argument in this particular dialogue.

f) Filters.

Pros: In near-universal use already, at least in some venues. Can be very, very effective in abating spam (easily 90% and up). This is literally as close as one can come to an automated implementation of a real "consent" model (which requires examination of content one way or another, regardless of whether or not a document is signed, encrypted, certified).

Cons: It's a war. There is an absolutely unavoidable (mathematically grounded) conflict between effectiveness (percent of false negatives, spam that gets through) and undesirable side effects (percent of false positives, real mail that gets rejected as spam). Spammers try to fool the filter and filter writers try to catch the spam.
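To make the false negative/false positive conflict concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration: the scores are made-up numbers, not measurements, and `error_rates` is a hypothetical helper rather than part of any real filter. It only shows that moving a single rejection threshold trades one kind of error for the other.

```python
# Hypothetical filter scores in [0, 1]; higher means "looks more like spam".
# These values are invented for illustration, not taken from real mail.
SPAM_SCORES = [0.95, 0.90, 0.80, 0.70, 0.55, 0.40]
HAM_SCORES = [0.05, 0.10, 0.20, 0.35, 0.50, 0.60]


def error_rates(threshold, spam_scores=SPAM_SCORES, ham_scores=HAM_SCORES):
    """Return (false_negative_rate, false_positive_rate) for a threshold.

    A message is rejected as spam when its score is >= threshold.
    False negative: spam that scores below the threshold and gets through.
    False positive: real mail that scores at or above it and is rejected.
    """
    false_negatives = sum(1 for s in spam_scores if s < threshold)
    false_positives = sum(1 for s in ham_scores if s >= threshold)
    return (false_negatives / len(spam_scores),
            false_positives / len(ham_scores))


if __name__ == "__main__":
    for threshold in (0.3, 0.5, 0.7):
        fn_rate, fp_rate = error_rates(threshold)
        print(f"threshold={threshold:.1f}  "
              f"spam let through={fn_rate:.0%}  "
              f"real mail rejected={fp_rate:.0%}")
```

With these invented scores, a low threshold stops all the spam but rejects half of the real mail, and a high threshold rejects no real mail but lets a third of the spam through. As long as the two score distributions overlap, no threshold drives both numbers to zero.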
Incorporating the filter into the MTA to ameliorate the false positive problem has advantages (the hope that the message sender learns that the message didn't get through and can resend or send an out of band message to open a hole for the filtered message) and disadvantages (the user can filter some more in their MUA, the spammer can literally probe the MTA filter looking for holes in the filter algorithm, additional load on mail servers). Filtering is all about cost-benefit at many levels.

This is a very robust and dynamic solution, and is unlikely to go away unless/until things like legal measures and improved AUPs ameliorate the problem (if they ever do). It can be implemented by individuals at the user level. It can be implemented by sysadmins at the domain level. Filters and other intelligent agents COULD be implemented by SPs at the transport level to identify clients that are spamming, and if it ever WERE implemented at this level and the SPs came down on AUP violators like a ton of bricks with contractual monetary penalties, the spam problem might really significantly abate. I believe it was Vernon (again) who pointed out that if tier 1 and 2 providers really wanted to regulate the problem and were willing to "own" it they could do it now.

Most of the detailed solutions thus far proposed fit into one or more of the categories above. Some require universal registration of one sort or another of individuals or additional registration components for networks. Some require certification authorities. Some require integrated costing agents or challenge/response systems to be inserted into (I suppose) MTAs. Little discussion of costs on the part of the proposers, often wildly optimistic benefit claims.

In summary, in my opinion, for whatever it might be worth (quite possibly nothing:-):

a) A silly solution.
There isn't any reason to believe that adding scaled costs to spammers will deter spam (cost doesn't deter advertising anywhere else in our lives at costs up to $1/message, so why should it deter spam at costs less than this). Adding costs WILL deter us from using mail at a far lower threshold than it deters spammers. At $0.10/message, I'd be spending $3-5/day on mail, depending on how lists are billed. If lists are billed per message, I'd be spending hundreds of dollars a day at $0.01/message, as I'm on some pretty big lists. Costs in time (puzzle solving) are just as bad and discriminate against children and stupid people and handicapped people and list administrators.

b) A silly solution to continue to promote AS A SOLUTION TO SPAM and IN PROTOCOL as part of e.g. a new MTA. That is, by all means sign messages if you like. Create new signature-aware mail clients or graft the capability into old ones. Use encrypted mail transport. Create distributed signature databases and tools. Just don't suggest that it will have any impact on spam and don't "force" me to use them, make them so attractive that I WANT to use them, entirely separate from the spam issue.

After all, I can't see why this would abate spam in the slightest on purely logical grounds. I have no more of a way of assessing mail from a stranger that signed it than I do of assessing mail from a stranger that didn't sign it -- either way I have to open the envelope and filter it and/or look at it.

c) Don't make me laugh. I don't consent to spam NOW, and insist on being able to receive mail from real strangers without the slightest hint of prior introduction or a priori common ground.
I just cannot tell what is spam and what isn't without looking, and given order of a billion (several hundred million and growing) strangers and corporate entities with access to mail and given that I can set up a dozen effectively "anonymous" mail accounts in the time it has taken me to write this response I'll NEVER be able to give consent on a person by person basis without opening the envelope. If I have to open the envelope, this adds NOTHING to the anti-spam filtering measures already represented in f).

Note that a) through c) I think are so ill-conceived that I (at the moment) cannot see how they could ever be made into proposals worth taking seriously. They all fail a cost-benefit analysis (high cost up front, likely to have little or even no discernable effect on spam). So sure, keep bringing them up and I (and others) will keep pointing this out. This isn't "being negative". It is "not wasting time and money on a complex measure that, in the end, won't have any useful effect on spam".

Remember, I think that there may not BE a solution to spam, in the sense that people seem to be looking for. There aren't any perpetual motion machines either. These two problems may well be linked (information theory is a common foundation). So to me, saying that a "solution" that is obviously flawed is obviously flawed is a CONSTRUCTIVE thing.

On to the good news.

d) seems worth pursuing. At least in the sense that AUPs are one of the effective impediments to spam NOW -- one of the things that largely keeps it from originating within academic networks, for example -- and thus there is a real possibility that a better schema for spam regulation at the network level could result in a significant, enduring, abatement of spam.
Here I think that the IETF could be a very positive force and provide real guidance (in the form of specifications for software that might be used by SPs to detect patterns of abuse originating within their boundaries, for example, and possibly with legal work on contract templates that would permit them to impose financial penalties). Note also that "could" does not mean "will" -- this is a reasonable place to TRY to find a better solution, but one may not exist or may be too costly or infeasible for other reasons.

e) Time will tell, but again worth pursuing. Largely outside of the IETF's purview, though, except as a sort of "amicus curiae" and insofar as it integrates with work done on d).

f) The ONLY solution (aside from AUPs in widespread use against spam), and in many cases a very effective solution. It has the advantage of being evolutionary (responsive to changes in the strategy of spammers and a changing AUP and legal landscape) and of requiring no higher-level consent or approval (from e.g. the IETF) to implement on any level from the individual to the domain. A variety of free filtering agents are available as are a variety of commercial filtering agents, and there is a healthy degree of competition and market choice. I don't see the preeminence of filtering as an anti-spam measure changing, and don't REALLY see much of a role for the IETF in its continuing evolution.

Obviously this is bad news, possibly even unacceptably bad news, to folks disenchanted with filtering as anti-spam measures. Bad news or not, it is reality at the moment and quite possibly really is the best we can do short of legislation or better enforcement at the AUP/SP level. Even things such as signatures, white/blacklisting of networks or individuals, inclusion of various tokens (solutions to puzzles, e.g.) are likely to become just another component, and not necessarily a very powerful component, in the multidimensional decisioning process of such a filter.
Once the mail has left the point of origination, ONLY a filter (including the human brain used as a filter) can examine content, and ONLY content determines whether a message, however else you might flag it, authenticate it, sign it, white or black list it, vouch for it, is or isn't spam. AT the point of origination this is very nearly true as well, but there the controlling agents are SPs or the government as they are far away from you personally in space and time.

Perhaps statements "urging" the integration of suitable filters with the MTA where this is possible are reasonable, I don't know -- arguments presented here were moderately persuasive that while this wouldn't affect their effectiveness against spam, the improved treatment of false positives would permit a lowering of the threshold that identifies a piece of mail as spam and favorably alter the false negative/false positive numbers. This could be so, although I'd worry a bit about exposing the filter so directly to the spammers to be probed for weaknesses unless this were accompanied by other measures that might limit such probing (such as "immediate" exposure to detection followed by rejection from the network and/or prosecution).

In conclusion, I have seen almost nothing in the entire spam abatement discussion that can be taken seriously BECAUSE the proposals do not include the steps described by Vernon. It would be lovely if future proposals did, in fact, list BOTH the pros (what one hopes to gain from the proposal to abate spam) AND the cons (the "obvious problems"). It would be lovely if they were "numerate" (actually used numbers, backed up by observation, measurement, real data) where relevant. It would be smashing if they had a cost-benefit analysis where the benefits (real or imagined) of the pros were contrasted with the costs of implementation and of dealing with the cons.
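As a rough sketch of what even a minimal "numerate" check might look like, here is the per-message fee from a) worked through in Python. The fee levels, the thirty personal messages a day, and the eight-thousand-member list echo figures mentioned earlier in this note; the three list posts per day, the billing rule, and the function name `daily_cost_cents` are assumptions made purely for illustration.

```python
def daily_cost_cents(fee_cents, personal_messages, list_posts=0,
                     list_size=0, bill_per_recipient=False):
    """Daily cost, in cents, of a flat per-message fee for one sender.

    Assumption: a list post is billed either once (like any other message)
    or once per list member, depending on bill_per_recipient.
    """
    billable = personal_messages
    if bill_per_recipient:
        billable += list_posts * list_size
    else:
        billable += list_posts
    return fee_cents * billable


if __name__ == "__main__":
    # 30 personal messages a day at $0.10 each.
    print(daily_cost_cents(10, 30) / 100)
    # Same sender at $0.01 per message, plus 3 posts (assumed) to an
    # 8000-member list billed per recipient.
    print(daily_cost_cents(1, 30, list_posts=3, list_size=8000,
                           bill_per_recipient=True) / 100)
```

Under these assumptions the first case comes to $3.00 a day and the second to $240.30 a day, almost all of it from the list traffic. The particular figures matter less than the fact that a dozen lines of arithmetic expose the question of how lists are billed before anybody builds anything.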
It would be nice if the numerate part of the cost-benefit analysis included the scaling issues -- for example, the costs of imposing a user-level solution on a half-billion or so mail users using a dazzling array of platforms and operating systems as opposed to the costs YOU experience hacking something together for yourself on top of an open source operating system where the solution comes to you prebuilt for free.

To add a requirement to Vernon's list, it might also be nice to see what in any proposal is actually new -- we've seen at least three or four notes proposing "solutions" that are not only not magic bullets, they are already implemented to little or no effect on spam.

Finally, it does seem like an analysis of whether or not a proposed solution would stand the test of time is in order. For example, I can go to my reject/spam folder du jour and -- perhaps -- find some key phrase that is present in all spam. I can then put that phrase into my filter definition and say "Voila! I have solved the spam problem!" And for a day, that might even be true, and if I kept it a dark secret it might last a month or more.

However, can you imagine me proposing this phrase as THE solution for EVERYBODY on this list? Just publishing it here would ensure that it wouldn't last the day. For a solution to be realistic, one has to be able to at least argue persuasively that the first hacker who comes along won't be able to route around your "impervious" obstacle, that your obstacle will have a real effect on a whole class of spam that cannot easily be overcome by someone with access to the same data that you have.

   rgb

--
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@xxxxxxxxxxxx