Do males and females express themselves differently in text? According to research carried out at the University of Texas, the answer is yes. The article “Effects of Age and Gender on Blogging” [1] finds that an author’s gender can be determined with an accuracy of about 80% from the text alone. This is achieved with a classifier trained on 37,478 blogs written by males and females at blogger.com.
Gender stereotypes in the blogosphere
The research also lists the most discriminating terms for males and females (ranked by information gain).
Male favorite words
– linux
– microsoft
– gaming
– server
– software
– gb
– programming
– google
– data
– graphics
– india
– nations
– democracy
– users
– economic
Female favorite words
– shopping
– mom
– cried
– freaked
– pink
– cute
– gosh
– kisses
– yummy
– mommy
– boyfriend
– skirt
– adorable
– husband
– hubby
They conclude: “Male bloggers of all ages write more about politics, technology and money than do their female cohorts. Female bloggers discuss their personal lives – and use more personal writing style – much more than males do.”
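To make the information gain ranking behind the word lists above more concrete, here is a minimal sketch of how a term's information gain can be computed from labeled documents. It assumes the simplest setup, where each document either contains the term or does not; the four toy documents and the function names are made up for illustration and are not taken from the article.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, term):
    """Information gain of a term's presence/absence about the class label.

    docs is a list of (set_of_words, label) pairs.
    """
    all_labels = Counter(label for _, label in docs)
    with_term = Counter(label for words, label in docs if term in words)
    without_term = Counter(label for words, label in docs if term not in words)

    n = len(docs)
    n_with = sum(with_term.values())
    n_without = n - n_with

    # Entropy that remains once we know whether the term is present.
    conditional = (
        (n_with / n) * entropy(with_term.values())
        + (n_without / n) * entropy(without_term.values())
    )
    return entropy(all_labels.values()) - conditional


# Hypothetical toy corpus, only for illustration.
toy_docs = [
    ({"linux", "server", "the"}, "male"),
    ({"linux", "the", "data"}, "male"),
    ({"mom", "cute", "the"}, "female"),
    ({"shopping", "the", "mom"}, "female"),
]

vocabulary = set().union(*(words for words, _ in toy_docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(toy_docs, t), reverse=True)
print(ranked[:3])
```

In this toy corpus a word such as “the”, which appears in every document, tells us nothing about the author and gets an information gain of zero, while a word used by only one group scores highest.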
Try it on your blog
GenderAnalyzer.com uses the same approach as described in the article. They have collected 2000 blogs from blogger.com written by men and women. They also run a poll that lets us see how well it is working; at the time of writing it shows an accuracy of 70%.
Trying this blog in the analyzer gives us the correct answer:
Results
We think http://blog.uclassify.com is written by a man.
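For readers curious how a verdict like the one above can be produced, here is a minimal sketch of a text classifier trained on labeled examples. It uses a multinomial naive Bayes model with add-one smoothing as a simple stand-in; this is an assumption, not necessarily the learner used in [1] or by GenderAnalyzer.com. The class name, the tokenizer, and the four training sentences are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training sentences, only for illustration.
clf = NaiveBayes()
clf.train("installed linux on the new server and updated the software", "man")
clf.train("google released new programming tools and graphics data", "man")
clf.train("went shopping with mom and bought a cute pink skirt", "woman")
clf.train("my hubby is adorable and the dinner was yummy", "woman")

label = clf.classify("the server needs new software")
print(f"We think this text is written by a {label}.")
```

A real system would train on thousands of blogs rather than four sentences, and would measure accuracy on held-out texts, as the poll on the site does with user feedback.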
[1] J. Schler, M. Koppel, S. Argamon and J. Pennebaker (2006), Effects of Age and Gender on Blogging, in Proc. of AAAI Spring Symposium on Computational Approaches for Analyzing Weblogs, March 2006.