We are currently working on a prototype to identify spam blogs, or splogs. Spam blogs can be really tricky to identify, even for the human eye, as i-trepreneur.com writes in a recent post:
Why? These Splogs are user friendly. They were not made for search engines but for real visitors. There’s excellent design, well organized sections, working RSS feed. All the information on such Splogs is manually selected from the most popular resources on the net and is properly referenced. Only fresh content is used so it is not identified as duplicate instantly.
The post points out that madconomist dot com and business-opportunities dot biz are two well-made splogs that people are commenting on and linking to. I can’t tell just by looking at them with my bare eyes, so is it spam, huh? More on that philosophical aspect in a later post!
We have set up a prototype to identify spam blogs. Right now it’s really rudimentary, but it shows potential. In the future, by using clusters of classifiers hosted here at uClassify, we think we can create a sufficiently good splog classifier.
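How a cluster of classifiers would be combined is not described here. As a rough sketch of one simple possibility, the snippet below averages the spam probabilities returned by several independent classifiers and compares the average to a threshold. The function name `combine_scores`, the threshold, and the two toy stand-in classifiers are assumptions made for illustration; they are not the spamhuh.com implementation.

```python
from statistics import mean


def combine_scores(text, classifiers, threshold=0.5):
    """Average the spam probabilities (0.0 to 1.0) from several classifiers.

    `classifiers` is a list of callables that each take a text and return a
    spam probability. Returns (average_probability, is_spam).
    """
    if not classifiers:
        raise ValueError("at least one classifier is required")
    average = mean(classify(text) for classify in classifiers)
    return average, average >= threshold


# Hypothetical stand-ins for real trained classifiers.
def keyword_classifier(text):
    """Fraction of words that appear in a tiny, made-up spam word list."""
    spammy = {"cheap", "free", "casino"}
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(word in spammy for word in words) / len(words)


def link_classifier(text):
    """Fraction of whitespace-separated tokens that look like links."""
    words = text.split()
    if not words:
        return 0.0
    return sum(word.startswith("http") for word in words) / len(words)


if __name__ == "__main__":
    probability, is_spam = combine_scores(
        "free casino http://example.com",
        [keyword_classifier, link_classifier],
    )
    print(probability, is_spam)
```

Averaging treats every classifier as equally trustworthy; a weighted average or a vote would be equally reasonable choices.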
Check out the project here, www.spamhuh.com. Remember that it’s only an early prototype!
Of the two hard-to-detect spam blogs mentioned above, spamhuh.com correctly identifies one.
Try it out and let us know what you think!!
Creating your own classifiers has never been easier: we have developed a Click’n’Classify Graphical User Interface (GUI). This means that you can manually create and train your classifiers without knowing any programming at all. This is a very good way to test an idea. If the classifier works well, build your web site around it or use it for whatever purpose you like.
The GUI allows you to do everything that you can do via our Application Programming Interface (API). Also, just like phpMyAdmin shows the SQL queries, our uClassify GUI shows the corresponding XML queries, so you can easily understand and use the API from your own site. With the GUI you can:
- Create and remove classifiers
- Add and remove classes
- Train and untrain classes
- See basic information about your classifiers
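To make the operations in the list above concrete, here is a minimal in-memory sketch of a text classifier that supports adding and removing classes, training, and untraining. It is a toy multinomial naive Bayes model with add-one smoothing, written for illustration only. The class name `ToyClassifier` and its methods are hypothetical; they do not mirror the uClassify API or its XML format.

```python
import math
from collections import Counter


class ToyClassifier:
    """Tiny naive Bayes text classifier (illustrative, not the uClassify API)."""

    def __init__(self, name):
        self.name = name
        self.classes = {}  # class name -> Counter of word frequencies

    def add_class(self, class_name):
        self.classes.setdefault(class_name, Counter())

    def remove_class(self, class_name):
        self.classes.pop(class_name, None)

    def train(self, class_name, text):
        self.classes[class_name].update(text.lower().split())

    def untrain(self, class_name, text):
        """Undo a previous train() call for the same text."""
        counts = self.classes[class_name]
        counts.subtract(text.lower().split())
        # Drop words whose count fell to zero or below.
        for word in [w for w, c in counts.items() if c <= 0]:
            del counts[word]

    def classify(self, text):
        """Return {class name: probability}, assuming equal class priors."""
        if not self.classes:
            return {}
        vocabulary = set()
        for counts in self.classes.values():
            vocabulary.update(counts)
        words = text.lower().split()
        log_scores = {}
        for class_name, counts in self.classes.items():
            # Add-one smoothing; the extra +1 reserves mass for unseen words.
            denominator = sum(counts.values()) + len(vocabulary) + 1
            log_scores[class_name] = sum(
                math.log((counts[word] + 1) / denominator) for word in words
            )
        # Normalize in a numerically stable way so the values sum to 1.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        total = sum(weights.values())
        return {c: w / total for c, w in weights.items()}


if __name__ == "__main__":
    clf = ToyClassifier("splog-demo")
    clf.add_class("spam")
    clf.add_class("ham")
    clf.train("spam", "buy cheap pills now")
    clf.train("ham", "meeting notes for the project")
    print(clf.classify("cheap pills"))
```

Untraining here simply subtracts the word counts that training added, which is why the same text has to be passed to both calls.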
Screenshot – Create a classifier
This is what it looks like when you are about to create a classifier. Just log in and try it yourself!
Screenshot – Training a classifier
Just copy and paste the texts you want to use as training data.
We are very happy to announce that yet another site is using the uClassify web service! ofaust.com is a literature expert that finds out which classical author a text most resembles. The developers let us know that it has been trained on over 80 different works by classical authors such as Plato, Shakespeare, Tolstoy and of course Goethe.
The beta is now up and running. Please sign up and create your own web site using cool classifications!
Today we are very pleased to announce the beta release of a new web service that allows everyone to access text classifiers for free. In short, by using a web API (much like the one Google Maps offers), everyone can create and train their own classifiers.
Two sites using the API already exist. Be inspired and come up with your own classifiers:
Typealyzer.com – Analyzes the personality of a blog author.
GenderAnalyzer.com – Figures out if a text is written by a man or woman.
During beta we will test the server for usability, stability, scalability and performance.
All comments and feedback are very much appreciated!!
Jon Kågström, Roger Karlsson and Emil Kågström.