We have created a new service called UrlAi.com. The basic concept is to run blog posts through a set of classifiers over time. To begin with we use Gender, Age, Mood and Tonality, but the system is dynamic, so we can add new classifiers at any time. If you have created a classifier that would fit on urlai.com, let us know!
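To illustrate what "dynamic" could mean here, below is a minimal sketch of a registry that runs one post through every registered classifier. This is not how UrlAi.com is actually built; the `ClassifierRegistry` class, the toy classifiers and their labels are all made up for the example.

```python
from typing import Callable, Dict

# A classifier here is just any function that maps a text to a label.
Classifier = Callable[[str], str]


class ClassifierRegistry:
    """Hypothetical registry: classifiers can be added at any time."""

    def __init__(self) -> None:
        self._classifiers: Dict[str, Classifier] = {}

    def register(self, name: str, classifier: Classifier) -> None:
        # Adding a new classifier needs no change to the code that runs them.
        self._classifiers[name] = classifier

    def classify(self, text: str) -> Dict[str, str]:
        # Run the post through every classifier registered so far.
        return {name: clf(text) for name, clf in self._classifiers.items()}


# Toy stand-ins for real classifiers (placeholders, not real models).
def toy_mood(text: str) -> str:
    return "happy" if "!" in text else "neutral"


def toy_tonality(text: str) -> str:
    return "personal" if " i " in f" {text.lower()} " else "objective"


registry = ClassifierRegistry()
registry.register("mood", toy_mood)
before = registry.classify("I love this film!")

# Later on, a new classifier is plugged in without touching the rest.
registry.register("tonality", toy_tonality)
after = registry.classify("I love this film!")
```

In this sketch `before` only contains a mood label, while `after` contains both mood and tonality for the same post.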
We have many ideas for developing this project further. For example, right now we only show a summary pie chart; it would be nice to see posts over time. User feedback could possibly be used for online training and classifier improvement. Another thing we could do is make classified posts searchable, for example enabling users to see the mood of everyone who mentioned ‘Avatar’.
We just want to thank the people who have been involved in this project: Roger Karlsson for coding, Johanna Forsman for the awesome logo and Mattias Östmar for sharing his Tonality and Mood classifiers. Mattias has also contributed many ideas around this, being the idea fountain he is 😀
We have just released ageanalyzer.com, a site that reads a blog and guesses the age of the author!
Our writing style reflects us in many ways; for example, texts written in anger probably differ from texts written in joy. Reading a text intuitively gives us a clue about the author, as we start forming a picture in our heads. Sometimes it’s easy to pinpoint how we got this picture, and at other times it is harder.
We wanted to know if we could give computers the same intuition. In this particular project we are interested in finding out if a computer can tell the age of an author given only a text.
To do this experiment we collected 7000 blogs that had age information in the profile and split them into 6 age groups: 13-17, 18-25, 26-35, 36-50, 51-65 and 65+. We then created a classifier on uClassify and fed it the training data. Voilà!
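For readers curious what "feeding a classifier training data" can look like, here is a minimal sketch of a word-count based naive Bayes text classifier. We are not claiming this is what uClassify does internally; the class name, the whitespace tokenization and the toy training texts are assumptions made only for illustration.

```python
import math
from collections import Counter, defaultdict

AGE_GROUPS = ["13-17", "18-25", "26-35", "36-50", "51-65", "65+"]


class NaiveBayesTextClassifier:
    """Illustrative multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()  # naive whitespace tokenization
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: how common this age group is in the training data.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one smoothing so unseen words do not zero out a group.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (made up), one text per group for two of the groups.
clf = NaiveBayesTextClassifier()
clf.train("homework school exam teacher", "13-17")
clf.train("retirement grandchildren garden pension", "65+")
guess = clf.classify("my teacher gave us homework")
```

With only these two toy training texts, `guess` comes out as the 13-17 group, because "teacher" and "homework" were seen only there. A real run would of course use thousands of blogs per group.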
After running tests on the training data (10-fold cross-validation) it was clear that our classifier was able to find differences between the six age groups. We expect the proportion of correctly classified blogs to be around 30%, compared to a baseline of about 17% (1 in 6), which is what we would expect if the classifier were guessing at random among the six groups.
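The evaluation idea can be sketched in a few lines: split the labeled blogs into k folds, train on k-1 of them, test on the one left out, and repeat for each fold. The code below is a generic illustration, not our actual test setup; `cross_validate`, `chance_baseline` and the stand-in `MajorityClassifier` are hypothetical names, and any object with `train` and `classify` methods could be plugged in instead.

```python
import random
from collections import Counter

AGE_GROUPS = ["13-17", "18-25", "26-35", "36-50", "51-65", "65+"]


def chance_baseline(num_classes):
    """Expected accuracy when guessing uniformly at random among the classes."""
    return 1.0 / num_classes


class MajorityClassifier:
    """Stand-in model: always predicts the most common training label."""

    def __init__(self):
        self.labels = Counter()

    def train(self, text, label):
        self.labels[label] += 1

    def classify(self, text):
        return self.labels.most_common(1)[0][0]


def cross_validate(samples, k, make_classifier, seed=0):
    """Return overall accuracy of k-fold cross-validation (assumes k >= 2)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    folds = [samples[i::k] for i in range(k)]  # k roughly equal parts
    correct = 0
    for i, held_out in enumerate(folds):
        clf = make_classifier()  # fresh model for every fold
        for j, fold in enumerate(folds):
            if j != i:  # train on everything except the held-out fold
                for text, label in fold:
                    clf.train(text, label)
        correct += sum(clf.classify(text) == label for text, label in held_out)
    return correct / len(samples)


# Toy data (made up): 10 placeholder texts per age group.
toy_samples = [(f"post {n}", group) for group in AGE_GROUPS for n in range(10)]
accuracy = cross_validate(toy_samples, k=10, make_classifier=MajorityClassifier)
baseline = chance_baseline(len(AGE_GROUPS))  # 1/6, i.e. about 17%
```

The point of the baseline is that an accuracy figure only means something relative to what blind guessing would achieve, which for six groups is 1/6.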
We have added a poll to the site to help us see how well (or poorly) it works!
Try AgeAnalyzer out here!