Feedback, anyone?

One of the most popular published classifiers is Language detection, which classified more than 800,000 texts last week. However, this is just the tip of the iceberg, as most classifiers (about 500) are unpublished. Not all of them are active, of course, but a good number are, which brings me to my question: how is it working for you? I’m not getting much feedback or support requests – I am not sure if this is good or bad.

If you read this and are using uClassify for a project, feel more than free to contact me with any positive or negative feedback on any aspect (classifier performance, documentation, response times or anything unclear). You may leave a comment or e-mail me at this address: contact at uclassify dot com. <– Are spambots able to read this nowadays?

Over and out!

uClassify in the press

We’ve had hundreds of thousands of mentions throughout the blogosphere and I’m really thankful for this!!! I’ll try to update this post as we go along. Please comment if you can help me with this list!

Here are some I remember/managed to find:


CNBC

CNBC blogs about Mattias Östmar’s brilliant invention Typealyzer, 2008-03-25.


Business Week

Business Week writes about Typealyzer, 2009-03-22.


Din Side

Norwegian newspaper tests author recognition with uClassify, 2009-03-04.


Technology Review

Article about uClassify (only in paper version), 2009-01-01. Germans really seem to be interested in this kind of stuff!


ReadWriteWeb

Article about uClassify, 2008-12-07.


Kevin Kelly

Kevin Kelly mentions Typealyzer, 2008-12-05.


Doc Searls

Doc Searls mentions Typealyzer, 2008-11-30.

German Daily Taz

Another Genderanalyzer interview, 2008-11-12.


SuedDeutsche

Germany’s biggest daily newspaper, with a circulation of 450 000 copies. Genderanalyzer interview, 2008-11-10.


BoingBoing

Genderanalyzer is featured 2008-11-03.

Trve or Emo?

Another interesting uClassify web application has seen the light of day: Trve VS Emo. It tests whether a site is “True Black Metal” or “Emotional”. The author, Albert Örwall, writes:

To “train” the trve-classification I’ve used lyrics by norweigan black metal bands, such as Mayhem, Burzum and Darkthrone. The emo-classification is based on lyrics by emo bands like My Chemical Romance and Fall out boy…

I tested it with a hard rock blog I found at random, Hard Rock Hideout, which proved to be 81% Trve (true black metal). I then tested it with this blog, which turned out to be 100% Emo 🙂
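To give a feel for how a two-class text classifier like this can be trained and return percentages, here is a minimal sketch. It assumes a multinomial naive Bayes model with add-one smoothing, which is a common choice for this kind of task but is only a stand-in here: it is not the uClassify API and not necessarily what Trve VS Emo runs on. The function names and the training phrases are made up for illustration (they are not real lyrics).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(samples):
    """Count words per class; samples maps a class name to a list of texts."""
    counts = {label: Counter() for label in samples}
    for label, texts in samples.items():
        for text in texts:
            counts[label].update(tokenize(text))
    return counts


def classify(counts, text):
    """Return a dict of class name -> probability (the values sum to 1)."""
    vocab = set()
    for word_counts in counts.values():
        vocab.update(word_counts)
    log_scores = {}
    for label, word_counts in counts.items():
        total = sum(word_counts.values())
        # Uniform class prior; add-one (Laplace) smoothing for each word.
        log_scores[label] = sum(
            math.log((word_counts[w] + 1) / (total + len(vocab)))
            for w in tokenize(text)
        )
    # Normalise in log space so long texts do not underflow to zero.
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp.values())
    return {label: v / norm for label, v in exp.items()}


# Hypothetical stand-in training phrases, not real lyrics.
samples = {
    "trve": ["frost and darkness over the northern forest",
             "cold winter night eternal darkness"],
    "emo": ["my broken heart cries alone tonight",
            "tears and heart ache when you leave"],
}
model = train(samples)
result = classify(model, "darkness falls over the frozen forest")
for label, p in sorted(result.items()):
    print(f"{label}: {p:.1%}")
```

With so little training text the percentages mean very little; the point is only that each class gets a share of the probability, which is how a result like “81% Trve” can come about.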

Is there any need for automatic music tagging?

This is really cool. Another cool thing would be a classifier trained on lyrics from all genres (hip-hop, country, soul etc). It would not only be a fun way to test your blog, it could also be used for automatic lyric tagging (and hence track and album tagging). Does anyone know if there is any need for such a web service?

Using published classifiers

We’ve just made it possible for everyone with a (free) uClassify account to access public classifiers.

Once a classifier is published, everyone can use it via the GUI or the web API, and in return authors get a link to their website from everyone who uses their classifiers. This should hopefully inspire more people to share their cool classifiers!

As an example of a published classifier check out the mood classifier by prfekt.se. Here is the list of all published classifiers.

Bloggparti.se – is text left or right wing? (Swedish)

A new site called bloggparti.se (only works for Swedish blogs/texts) using uClassify has spread through the Swedish blogosphere. The site takes a blog or text and tests how closely it resembles each of the major Swedish political parties.

Mattias Aspelund from 49lights.com created this classifier using 100 tagged blogs from each party. The site was created within 24 hours and had more than 1000 requests on the first day.

We think it’s very exciting to see how quickly people can build cool applications around uClassify. Self-test sites seem to be very popular among bloggers; for example, genderanalyzer.com went from 0 to Google PageRank 6 in just three months.

I know there are more applications being built right now, and I’m looking forward to seeing those in action!

TrollGuard – protects your blog from spam comments

Roger and I have just finished TrollGuard – an anti-spam plugin for WordPress 2.7 or later.

The plugin is in Beta and we are aware of some missing features – however, we would greatly appreciate it if someone out there wanted to do some testing for us and come back with feedback!

This has been a small side project we did during our Christmas holidays using the uClassify API. We think it’s really cool that in less than a week we were able to set up a new Akismet-like service. Previous uClassify web applications have mostly been for entertainment; this plugin will actually do something helpful – protect blogs from spam comments.

We are also confident in the accuracy of TrollGuard as similar classification technology has been used in Cactus Spam Filter since 2004.

Well now it’s up to you to test it! What isn’t working? What features are missing? Let us know!

Check TrollGuard out!

LibraryThing announces uClassify competition

On LibraryThing you can add your own books to a personal library. By doing this you start to get recommendations, either from other users who have read the same book or automatically from the system. There are also several forums where users can discuss books – just like a really really big book club. At the time I signed up there were over 34 million books added. I added a couple of books I have recently read and to my surprise all of them already existed in the system, even the Swedish ones. After adding them I immediately got lots of recommendations, such as “The Satanic Verses” and “Robot : mere machine to transcendent mind”. Really cool!

Now with all these books some kind of categorization could help.

Competition

LibraryThing are encouraging their users to create something cool with uClassify. The prize is a $100 Amazon gift certificate and Toby Segaran’s “Programming Collective Intelligence”. LibraryThing also presents a couple of cool ideas which you can use, such as fiction vs non-fiction. The competition ends on February 1, 2009, so what are you waiting for?

Buzz & Development

Yesterday we were mentioned on ReadWriteWeb, which generated a lot of visits and, more importantly, classifiers. 30 new classifiers were created within 10 hours. Many were created just out of curiosity to quickly test the system, but some will hopefully mature and have web applications built around them.

What’s going on techwise

As you may have noticed, we are continuously improving our system by carefully adding new features. The following tasks are planned for the GUI:

We are soon installing a new more flexible menu system.

Users will be able to create profiles with descriptions and links. Classifiers should also be able to link to the web site where they are used.

Better information about training – right now there is no feedback on how much training has been done or is required. We want to give users an idea of how the training data performs.

What’s going on commercialwise

Everything is free on uClassify and that is how it will stay.

Our commercial idea is to offer companies the possibility to buy their own classification servers. For large databases of texts that need to be classified, it’s impractical to send every text on a roundtrip to uclassify.com. Instead, companies could be interested in doing this efficiently locally. A products page with server information will appear soon.

What’s your mood?

Today, 2 months after our launch, our users have created over 200 classifiers. Most are unpublished and under construction. PRfekt, the team behind the popular Typealyzer, recently published a new classifier that determines the mood of a text – whether a text is happy or upset. You can try it for yourself here!

So let’s test some snippets!

Jamis is (justly) upset and writes:

Is anyone else annoyed by the “just speak your choice” automation in so many telephone menus? I feel like an idiot mumbling “YES!” or “CHECK BALANCE!” into my phone. Maybe it’s the misanthrope in me coming to the front, but I’d much rather push buttons than talk to a pretend person.

The mood classifier says 98.1% upset.

Spam is no fun either, or as Ed-Anger notes:

“I’m madder than a rooster in an empty hen house at Internet spammers and I won’t take it anymore. Those creeps clutter up my e-mail with their junk, everything from penis enlargement pills to some lady telling me she’ll give me a million dollars if I’ll help her get her money out of Africa. “Rush me 10 grand quick as possible and we’ll get the whole thing started,” she says.”

The mood classifier says 97.0% upset.

Now over to some happy blogs. amour-amour has a confession:

“I love my iphone in a way I never thought possible!! When my fiance got his and spent 23 hours gazing at it lovingly, uploading (or is it downloading??) apps and buying accessories for it I put it down to him just being a technology geek.”

The mood classifier says 79.8% happy.

Finally, Nitwik Nastik comments on a Ricky Gervais routine:

“This is a hilarious stand-up routine by British Comedian Ricky Gervais on Bible and Creationism. It’s really funny how he ridicules the creationist stories from the book of Genesis (the book of genesis can be found here)and point out to it’s obvious logical blunders. Sometimes it may be difficult to understand his accent and often he will make some funny comments under his breath, so try to listen carefully.”

The mood classifier says 69.7% happy.

The author recommends at least two hundred words (more text than my samples) which seems reasonable!

Spam, huh?

We are currently working on a prototype to identify spam blogs – splogs. Spam blogs can be really tricky to identify even to the human eye, as i-trepreneur.com writes in a recent post:

Why? These Splogs are user friendly. They were not made for search engines but for real visitors. There’s excellent design, well organized sections, working RSS feed. All the information on such Splogs is manually selected from the most popular resources on the net and is properly referenced. Only fresh content is used so it is not identified as duplicate instantly.

The post points out that madconomist dot com and business-opportunities dot biz are two well-made splogs that people comment on and link to. I can’t tell just by looking at them with my naked eye – so is it spam, huh? A later post will cover that philosophical aspect!

A prototype

We have set up a prototype to identify spam blogs. Right now it’s really rudimentary but it shows potential. In the future, by using clusters of classifiers hosted here at uClassify, we think we can create a sufficiently good splog classifier.
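How a cluster of classifiers would be combined is not decided here. One simple possibility is to let each classifier report a spam probability and take a weighted average, as in the sketch below. The classifier names, the scores and the 0.5 threshold are all hypothetical; this is an illustration of the idea, not how spamhuh.com works.

```python
def combine(scores, weights=None):
    """Weighted average of per-classifier spam probabilities (each 0..1)."""
    if weights is None:
        # Assumption: trust every classifier equally unless told otherwise.
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical outputs from three separate classifiers for one blog.
scores = {"content": 0.9, "links": 0.6, "posting_frequency": 0.3}

spam_probability = combine(scores)
is_splog = spam_probability >= 0.5  # hypothetical decision threshold
print(f"spam probability: {spam_probability:.2f}, splog: {is_splog}")
```

Weights let a classifier that has proven more reliable count for more, for example `combine(scores, {"content": 2.0, "links": 1.0, "posting_frequency": 1.0})`.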

Check out the project here, www.spamhuh.com. Remember that it’s only an early prototype!

Of the two hard-to-detect spam blogs above, spamhuh.com is able to correctly identify one :)

Try it out and let us know what you think!!