Gary Robinson is an American software engineer, mathematician, and inventor notable for his mathematical algorithms to fight spam. He is credited as being one of the first to use automated collaborative filtering technologies to turn word-of-mouth recommendations into useful data. In addition, he patented a method that uses web browser cookies to track consumers across different web sites, allowing marketers to better match advertisements with consumers. The patent was bought by DoubleClick, and DoubleClick was later bought by Google.

Spamming is the abuse of electronic messaging systems to send unsolicited, undesired bulk messages. In 2003, Robinson's article in Linux Journal detailed a new approach to computer programming, perhaps best described as a general-purpose classifier, which expanded on the usefulness of Bayesian filtering. Robinson's method used math-intensive algorithms combined with chi-square statistical testing to enable computers to examine an unknown file and make intelligent guesses about what was in it. The technique had wide applicability: for example, Robinson's method enabled computers to examine a file and guess, with much greater accuracy, whether it contained pornography, or whether an incoming email to a corporation was a technical question or a sales-related question.

The method became the basis for anti-spam techniques used by Tim Peters and Rob Hooft of the influential SpamBayes project. SpamBayes assigned probability scores to both spam and ham (useful emails) to guess intelligently whether an incoming email was spam; the scoring system enabled the program to return a value of "unsure" if both the spam and ham scores were high. Robinson's method was used in other anti-spam projects such as SpamAssassin.

Robinson commented in Linux Journal on how fighting spam was a collaborative effort:

"The approach described here truly has been a distributed effort in the best open-source tradition. Paul Graham, an author of books on Lisp, suggested an approach to filtering spam in his on-line article, 'A Plan for Spam'. I took his approach for generating probabilities associated with words, altered it slightly and proposed a Bayesian calculation for dealing with words that hadn't appeared very often. Then I suggested an approach based on the chi-square distribution for combining the individual word probabilities into a combined probability (actually a pair of probabilities; see below) representing an e-mail. Finally, Tim Peters of the Spambayes Project proposed a way of generating a particularly useful spamminess indicator based on the combined probabilities."

See also: SpamBayes, SpamAssassin, recommendation engine, collaborative filtering.
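To make the scoring idea above more concrete, here is a minimal Python sketch of the three steps the text mentions: a Bayesian adjustment for rarely seen words, a chi-square combination of the per-word probabilities into a pair of scores, and a single indicator that can come out as "unsure" when both scores are high. It is an illustration under stated assumptions, not the code of SpamBayes or SpamAssassin. The function names, the `strength` and `prior` defaults, and the two cutoffs are all hypothetical choices made for this example.

```python
import math


def chi2_survival(x, dof):
    """P(X >= x) for a chi-square variable with an even number of degrees of freedom."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = total = math.exp(-half)
    # For even dof the survival function is a finite sum of Poisson terms.
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return min(total, 1.0)  # guard against rounding slightly above 1


def smoothed_probability(p_word, n_seen, strength=1.0, prior=0.5):
    """Pull a raw per-word spam probability toward `prior` when the word is rare.

    `strength` and `prior` are assumed tuning values, not taken from the text.
    """
    return (strength * prior + n_seen * p_word) / (strength + n_seen)


def combine(word_probs, eps=1e-12):
    """Return (spam_score, ham_score, indicator) for a list of word probabilities."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    # Clamp so that log() never sees exactly 0.
    probs = [min(max(p, eps), 1.0 - eps) for p in word_probs]
    dof = 2 * len(probs)
    log_not_spam = sum(math.log(1.0 - p) for p in probs)
    log_spam = sum(math.log(p) for p in probs)
    # Each score is high when the words give strong evidence for that class.
    spam_score = 1.0 - chi2_survival(-2.0 * log_not_spam, dof)
    ham_score = 1.0 - chi2_survival(-2.0 * log_spam, dof)
    indicator = (1.0 + spam_score - ham_score) / 2.0
    return spam_score, ham_score, indicator


def classify(indicator, ham_cutoff=0.2, spam_cutoff=0.9):
    """Map the indicator to a label; both cutoffs are illustrative assumptions."""
    if indicator >= spam_cutoff:
        return "spam"
    if indicator <= ham_cutoff:
        return "ham"
    return "unsure"


# A message whose words point strongly both ways ends up near 0.5.
mixed = [0.99] * 5 + [0.01] * 5
print(classify(combine(mixed)[2]))
```

Keeping the two scores separate until the last step is what allows the "unsure" outcome: a message with strong spam clues and strong ham clues yields two high scores that cancel in the indicator, rather than being forced into one class.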