Monday, November 30, 2009

Removing the spam in the signal? Helping the good Samaritans

I'm beginning to feel like this is a catchall blog for topics that don't fit anywhere else... How to justify this entry in this place? Well, spam doesn't care about justifying its entry into every possible place. Epistemology is about assessing the significance of information, while spam is about destroying any significance of the information it impersonates. Much of my interest in epistemology has been driven by my desire to understand the nature of "good", but I already understand that spam is bad. Ergo, the rationalization seems to be that spam is the enemy of epistemology. Or should we just regard spam as the enemy of everything?

However, the pragmatic purpose of this blog entry is to describe a proposed countermeasure against spam for ease of reference...

The key to this suggestion is to leverage some of the essential characteristics of spam against the spammers. First, most of the recipients of the spam do NOT want to receive it. Second, the spammers can't obfuscate their spam beyond human understanding because they are searching for the very few humans who will send them money.

The suggestion is to provide a better spam complaint tool so that the people who don't like spam can fight spam more effectively. This is not really a cast-in-concrete proposal, but just one of the possible implementations. I'm focusing on Gmail as a familiar example, but this proposal doesn't need to involve all of the email systems to have a strong effect against spam. We just need to tip the economic scales against the spammers. When the spammers stop making money, the volume of spam will decline. QED.

What I suggest is basically a kind of iterative analysis system that would combine cheap computer analysis with the human intelligence of the MANY people who hate spam. Most of the actual programming is quite straightforward for anyone who is fluent with regular expressions.
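To give a feel for the regular-expression side of this, here is a minimal Python sketch of the cheap computer analysis: it pulls the web addresses and reply addresses out of a message body so that a human can be asked about them later. The function name `extract_leads` and both patterns are my own illustrative assumptions, deliberately simplistic, and not anything Gmail or SpamCop is known to use.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple patterns (assumptions for illustration only);
# real spam would need far more robust parsing than this.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_leads(message_text):
    """Return the URLs, host names, and email addresses found in a message."""
    urls = sorted(set(URL_RE.findall(message_text)))
    # The host name is what points toward a web host or registrar.
    hosts = sorted({urlsplit(u).hostname for u in urls if urlsplit(u).hostname})
    emails = sorted(set(EMAIL_RE.findall(message_text)))
    return {"urls": urls, "hosts": hosts, "emails": emails}
```

Each item this returns is only a lead. Deciding whether a given host is really part of the scam is exactly the kind of question that gets handed to the human.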

Right now Gmail has three main anti-spam options. One is the "Report spam" button for false negatives. There is also a corresponding button for false positives. These two buttons are (presumably) used for tuning the filters (which mostly just motivate the spammers to crank out more spam). The third option is the "Report phishing" option that Gmail (probably) uses for higher priority measures against some of the worst criminals among the spammers.

The new fourth option might be called "Analyze spam", to be used by the more serious 'wannabe spam fighters' (WSFs). The basic idea would be similar to SpamCop, but with deeper analysis and more iterations. The countermeasures would be targeted against the ISPs who are permitting the spam, against the websites and the hosting services of those websites, against the domain registrars who are assisting the spammers, and against any other co-conspirators or accomplices. There should be options for other affected parties, as when a bank's name is being used for phishing and the bank might want to take countermeasures, or when a company's valuable name is being used to sell counterfeit merchandise and the company wants to be notified so it can track the spam that damages its reputation. In each phase of analysis, the WSF will answer the questions that the computer can't deal with, making the decisions about what kind of spam it is and what kind of complaints will be most effective. After several iterations of analysis, and after collecting the human decisions from the WSF, the proper complaints and notifications would be sent.
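The back-and-forth between computer and WSF described above could be sketched roughly as follows in Python. Everything here is hypothetical: the category names, the `TARGETS_BY_CATEGORY` table, the `Analysis` record, and the `ask` callback that stands in for the person clicking through the questions. It is one possible shape for the loop, not a worked-out design.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical table: which parties to notify for each kind of spam.
TARGETS_BY_CATEGORY = {
    "phishing": ["sending ISP", "web host", "domain registrar", "impersonated bank"],
    "counterfeit": ["sending ISP", "web host", "domain registrar", "brand owner"],
    "other": ["sending ISP"],
}


@dataclass
class Analysis:
    leads: list
    category: Optional[str] = None
    confirmed_leads: list = field(default_factory=list)
    complaints: list = field(default_factory=list)


def analyze(leads, ask):
    """Run the question rounds; ask(question, options) returns the human's choice."""
    analysis = Analysis(leads=list(leads))
    # Round 1: the human decides what kind of spam this is.
    analysis.category = ask("What kind of spam is this?", sorted(TARGETS_BY_CATEGORY))
    # Round 2: the human confirms or rejects each machine-extracted lead.
    for lead in analysis.leads:
        if ask(f"Is {lead} part of the scam?", ["yes", "no"]) == "yes":
            analysis.confirmed_leads.append(lead)
    # Only after the human decisions are collected are complaints drafted.
    for lead in analysis.confirmed_leads:
        for party in TARGETS_BY_CATEGORY[analysis.category]:
            analysis.complaints.append((party, lead))
    return analysis
```

In a real tool, `ask` would be a web form and there would be more rounds (for example, following redirects and asking again), but the principle stays the same: the computer proposes, the human decides, and nothing is sent until the decisions are in.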

Remember the two key points here: (1) Because there are lots of people who hate spam and very few people who will respond with money, it is certain that there will be many good Samaritans who can try to block the spam before the sucker can send the money, and the more accessible this system is, the more likely it is that the spammer will be blocked. (2) Because the spammer cannot obfuscate beyond human understanding without losing the target victims, who are human, it is certain that the cloud of WSFs will provide the system with a large pool of human intelligence directed against the spammers. Also, the human intelligence allows for open-ended responses, so the WSFs could immediately report when the spammers come up with a new wrinkle, because any new wrinkle still needs to be open to human intelligence or the spammer can't get any money.

Yes, a WSF would have to make various decisions during the process, but this would NOT be on Google's nickel; it would just be the personal donation of some time to fighting the spammers. Actually, this system should also track the analytic performance of the human participants so Google would soon know who is really good at figuring out how to stop the spammers. In the context of Gmail, this would obviously be linked to the Gmail account that has received the spam.
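As one guess at what tracking that performance might look like, here is a small Python sketch that keeps a per-account tally. The scoring rule is purely my assumption: a WSF's judgment counts as "confirmed" if the complaint was later upheld by whoever received it, and accuracy is simply confirmed over total. The `Scoreboard` class and its method names are made up for illustration.

```python
from collections import defaultdict


class Scoreboard:
    """Per-account tally of how often a WSF's judgments were later confirmed."""

    def __init__(self):
        # account -> [confirmed judgments, total judgments]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, account, confirmed):
        counts = self._counts[account]
        if confirmed:
            counts[0] += 1
        counts[1] += 1

    def accuracy(self, account):
        # Accounts with no history score 0.0 rather than raising an error.
        confirmed, total = self._counts.get(account, (0, 0))
        return confirmed / total if total else 0.0

    def ranking(self):
        """Accounts ordered from most to least accurate."""
        return sorted(self._counts, key=self.accuracy, reverse=True)
```

A plain ratio like this would favor someone with one lucky report over someone with hundreds of mostly good ones, so a real system would presumably weight by volume as well.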

In summary, this system would be a tool to make it easier and more effective for the many people who dislike spam to fight against spam. This will tip the scales against the spammers so that they will stop making money from their spam, and therefore the spam problem will be greatly reduced.
