We are currently working on a prototype to identify spam blogs (splogs). Spam blogs can be really tricky to identify, even for the human eye, as i-trepreneur.com writes in a recent post:
Why? These Splogs are user friendly. They were not made for search engines but for real visitors. There’s excellent design, well organized sections, working RSS feed. All the information on such Splogs is manually selected from the most popular resources on the net and is properly referenced. Only fresh content is used so it is not identified as duplicate instantly.
The post points out that madconomist dot com and business-opportunities dot biz are two well-made splogs that people are commenting on and linking to. I can't tell just by looking at them with my bare eyes, so is it spam, huh? A later post will cover that philosophical aspect!
A prototype
We have set up a prototype to identify spam blogs. Right now it's really rudimentary, but it shows potential. In the future, by using clusters of classifiers hosted here at uclassify, we think we can create a sufficiently good splog classifier.
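To give a rough idea of what combining several classifiers could look like, here is a minimal Python sketch. It is not how spamhuh.com works and it does not call any uclassify API: the two scoring functions (`link_density`, `keyword_score`), the toy word list, and the 0.5 threshold are all made up for illustration. Each classifier returns a spam score between 0 and 1, and the scores are simply averaged.

```python
from typing import Callable, Sequence, Tuple

# A classifier is any function that maps blog text to a spam score in [0, 1].
Classifier = Callable[[str], float]


def link_density(text: str) -> float:
    """Hypothetical signal: share of tokens that look like links."""
    tokens = text.split()
    if not tokens:
        return 0.0
    links = sum(1 for t in tokens if t.startswith(("http://", "https://")))
    return links / len(tokens)


def keyword_score(text: str) -> float:
    """Hypothetical signal: share of tokens found in a toy spam word list."""
    spam_words = {"cheap", "free", "casino", "pills"}  # illustrative only
    tokens = [t.lower().strip(".,!?") for t in text.split()]
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in spam_words) / len(tokens)


def combine(
    classifiers: Sequence[Classifier], text: str, threshold: float = 0.5
) -> Tuple[float, bool]:
    """Average the scores of all classifiers and compare to a threshold."""
    if not classifiers:
        raise ValueError("at least one classifier is required")
    scores = [clf(text) for clf in classifiers]
    mean = sum(scores) / len(scores)
    return mean, mean >= threshold


if __name__ == "__main__":
    score, is_splog = combine([link_density, keyword_score], "cheap free casino pills")
    print(score, is_splog)
```

Plain averaging is the simplest way to combine scores. A real system would probably weight the classifiers differently or train a second classifier on their outputs.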
Check out the project at www.spamhuh.com. Remember that it's only an early prototype!
As for the two hard-to-detect spam blogs above, spamhuh.com is able to correctly identify one of them.
Try it out and let us know what you think!!