
February 19, 2004

Personal Email Networks: An Effective Anti-Spam Tool

The Nature article Sorting e-mail friends from foes describes a technique for catching spam that is based on the recipient's social network.

A simple and easily implemented scheme for combating e-mail spam has been devised by two researchers in the United States.

The technique exploits the structure of social networks to quickly determine whether a given message comes from a friend or a spammer. The method works for only about half of all e-mails received - but in all of those cases, it sorts the mail into the right category.

The paper it refers to is P. O. Boykin, V. Roychowdhury: Personal Email Networks: An Effective Anti-Spam Tool
Abstract:
We provide an automated graph theoretic method for identifying individual users' trusted networks of friends in cyberspace. We routinely use our social networks to judge the trustworthiness of outsiders, i.e., to decide where to buy our next car, or to find a good mechanic for it. In this work, we show that an email user may similarly use his email network, constructed solely from sender and recipient information available in the email headers, to distinguish between unsolicited commercial emails, commonly called "spam", and emails associated with his circles of friends. We exploit the properties of social networks to construct an automated anti-spam tool which processes an individual user's personal email network to simultaneously identify the user's core trusted networks of friends, as well as subnetworks generated by spams. In our empirical studies of individual mail boxes, our algorithm classified approximately 53% of all emails as spam or non-spam, with 100% accuracy. Some of the emails are left unclassified by this network analysis tool. However, one can exploit two of the following useful features. First, it requires no user intervention or supervised training; second, it results in no false negatives i.e., spam being misclassified as non-spam, or vice versa. We demonstrate that these two features suggest that our algorithm may be used as a platform for a comprehensive solution to the spam problem when used in concert with more sophisticated, but more cumbersome, content-based filters.
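To make the idea in the abstract concrete, here is a minimal sketch of how such a classifier might look. It is not the authors' implementation. The abstract does not say which graph property separates friend subnetworks from spam subnetworks; the sketch assumes the clustering coefficient (friends tend to know each other, a spammer's recipients do not). Linking only sender to recipient, the function names, and the thresholds `min_size`, `low` and `high` are all hypothetical choices made for illustration.

```python
def build_network(messages, user):
    """Build an undirected graph from (sender, recipients) header pairs.

    Assumption: an edge links the sender to each co-recipient, and the
    mailbox owner's own address is left out of the graph.
    """
    graph = {}
    for sender, recipients in messages:
        if sender == user:
            continue  # simplification: only received mail is considered
        graph.setdefault(sender, set())
        for rcpt in recipients:
            if rcpt == user or rcpt == sender:
                continue
            graph[sender].add(rcpt)
            graph.setdefault(rcpt, set()).add(sender)
    return graph


def components(graph):
    """Return the connected components of the graph as sets of addresses."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nb in graph[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        result.append(comp)
    return result


def clustering(graph, comp):
    """Average local clustering coefficient over the nodes of one component."""
    total = 0.0
    for node in comp:
        nbs = list(graph[node])
        k = len(nbs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbs[j] in graph[nbs[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(comp)


def classify(graph, min_size=4, low=0.1, high=0.5):
    """Label every address 'spam', 'non-spam' or 'unclassified'.

    min_size, low and high are hypothetical thresholds, not values
    taken from the paper.
    """
    labels = {}
    for comp in components(graph):
        label = "unclassified"  # too small or ambiguous components stay open
        if len(comp) >= min_size:
            c = clustering(graph, comp)
            if c < low:
                label = "spam"
            elif c > high:
                label = "non-spam"
        for addr in comp:
            labels[addr] = label
    return labels
```

Calling `classify(build_network(messages, "me@example.org"))` on a list of `(sender, [recipients])` pairs returns a dictionary from address to label. Components that are too small, or whose clustering falls between the two thresholds, are deliberately left as "unclassified" so that a content-based filter can handle them.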

See also P. Oscar Boykin.

Posted by hakank at February 19, 2004 01:27 AM. Posted to Social Network Analysis/Complex Networks

Comments

It is a good idea if it is extended to friends of friends, and perhaps even one step further. Your closest contacts are probably already in your address book, so the mail client can easily identify their mail as approved.

"our algorithm classified approximately 53% of all emails as spam or non-spam, with 100% accuracy."

Does that mean it chose "don't know" for the remaining 47%?

/Lars.

Posted by: Lars at February 19, 2004 04:18 PM

According to page 2 of the paper, the remaining 47% could not be classified because they belonged to subnetworks that were too small to give a statistically reliable classification.

Posted by: Håkan Kjellerstrand at February 19, 2004 07:50 PM