Can machine learning predict if a movie passes the Bechdel test?

To pass the Bechdel test:

  1. The movie has to have at least two women in it,
  2. who talk to each other,
  3. about something besides a man.

Doesn’t sound so hard to pass, does it? The test was introduced by Alison Bechdel in 1985 in one of her comics, ‘The Rule’.

The largest database of Bechdel-tested movies contains over 6000 titles, from 1892 up until now. What percentage do you think pass the Bechdel test overall? As I write this, about 58% of the movies have passed the test. Statistics from here.

Being interested in machine learning and data, I wondered whether there is a textual signal that separates movies that pass the test from movies that fail it.

Building a classifier to figure this out requires data. It needs labeled samples to learn from: a list of films that pass and films that fail the test, the more the better. Then, for each movie, we need to extract features. Features could be the cover, the title, the description, the subtitles, the audio or anything else that is in the movie.

Data & features

I was very happy when I found the database: it has a pretty extensive list of more than 6000 movie titles, each with information on whether it passed the Bechdel test or not. Even better, it has a Bechdel test rating of 0-3, where 0 means the movie fails the first part of the test and 3 means it passes all three parts.
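The 0-3 rating maps onto the binary pass/fail label needed for training. Here is a minimal sketch, assuming that ratings 1 and 2 mean a partial pass and therefore count as a fail. The function name and the sample rows are made up for illustration.

```python
def passes_bechdel(rating: int) -> bool:
    """Return True only when all three rules are satisfied (rating 3)."""
    if rating not in (0, 1, 2, 3):
        raise ValueError(f"rating must be 0-3, got {rating}")
    return rating == 3

# Hypothetical rows, not real database entries.
movies = [("Movie A", 0), ("Movie B", 2), ("Movie C", 3)]
labels = {
    title: ("passed" if passes_bechdel(rating) else "failed")
    for title, rating in movies
}
```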

Since I am dealing with text classifiers, the natural choices for features were:

– The description

– The subtitles

– The title

The descriptions were retrieved using an API that gets the plot from IMDb. I retrieved plots for 2433 failed and 3281 passed movies.

The subtitles were a bit more cumbersome to find. I used about 2400 randomly selected movies and spent some time downloading their subtitles from various sites. Phew.

Finally, the training data for the titles was easily obtained by creating samples containing only the movie titles for each class. In total, 2696 failed and 3669 passed movie titles.


I set up an environment and ran 10-fold cross-validation on all the data (train on 9/10 of the samples, test on the remaining 1/10, then rotate). For feature extraction I looked at case-insensitive unigrams and bigrams.
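As a rough illustration of those two steps, here is a sketch of unigram/bigram extraction and a 10-fold split in plain Python. The tokenizer regex, the function names and the fixed random seed are my assumptions, not necessarily what the actual setup used.

```python
import random
import re

def ngrams(text):
    """Case-insensitive unigrams and bigrams from a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams

def k_fold_splits(samples, k=10, seed=0):
    """Yield (train, test) lists; each sample lands in exactly one test fold."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test
```

Each round trains on nine folds and tests on the tenth, and the reported accuracy would be the average over the ten rounds.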

I trained a classifier on IMDb plots, each labeled with whether or not the corresponding movie had passed the test. The classifier turned out to have an accuracy of 67%.

By reading only the subtitles, uClassify was able to predict whether or not a movie would pass with an accuracy of 68%.

One classifier was trained to look only at the movie titles. Its accuracy was about 55%, which is not surprising at all considering how small the training dataset is.

Finally, I mashed together the subtitles and plots into one classifier, which showed a slight increase in accuracy, to 69%.
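The post does not spell out which algorithm sits behind the classifier, so as a stand-in here is a small multinomial Naive Bayes with add-one smoothing over unigrams and bigrams. Everything in it (function names, tokenizer, toy plots) is an assumption made for illustration, not the actual implementation.

```python
import math
import re
from collections import Counter

def features(text):
    """Case-insensitive unigrams and bigrams."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return words + [" ".join(pair) for pair in zip(words, words[1:])]

def train(samples):
    """samples: list of (text, label). Returns per-label feature and document counts."""
    counts, docs = {}, Counter()
    for text, label in samples:
        counts.setdefault(label, Counter()).update(features(text))
        docs[label] += 1
    return counts, docs

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    counts, docs = model
    vocab = set().union(*counts.values())
    total_docs = sum(docs.values())
    best_label, best_score = None, -math.inf
    for label, feats in counts.items():
        score = math.log(docs[label] / total_docs)  # class prior
        denom = sum(feats.values()) + len(vocab)    # add-one smoothing
        for f in features(text):
            if f in vocab:  # features never seen in training are ignored
                score += math.log((feats[f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, samples):
    """Fraction of (text, label) samples predicted correctly."""
    return sum(predict(model, text) == label for text, label in samples) / len(samples)

# Made-up toy plots, not real training data.
toy = [
    ("two sisters argue about their mother", "passed"),
    ("a soldier and his brother go to war", "failed"),
    ("her sister and her mother run a bakery", "passed"),
    ("he leads his men into war", "failed"),
]
model = train(toy)
```

With a model like this, accuracy(model, held_out_samples) on a held-out fold gives the kind of number reported here.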

Dataset           #Failed   #Passed   Accuracy
Plots             2433      3281      67%
Subtitles         1024      1262      68%
Titles            2696      3669      55%
Subtitles+plot    3457      4543      69%
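To put the accuracies in context, it helps to compare them against always guessing the larger class. This baseline is not part of the original experiment; the sketch below derives it purely from the counts in the table above.

```python
# (failed, passed, reported accuracy) copied from the table above.
results = {
    "Plots": (2433, 3281, 0.67),
    "Subtitles": (1024, 1262, 0.68),
    "Titles": (2696, 3669, 0.55),
    "Subtitles+plot": (3457, 4543, 0.69),
}

def majority_baseline(failed, passed):
    """Accuracy of always predicting the more common class."""
    return max(failed, passed) / (failed + passed)

for name, (failed, passed, acc) in results.items():
    base = majority_baseline(failed, passed)
    print(f"{name:15s} baseline {base:.1%}  reported {acc:.0%}  gain {acc - base:+.1%}")
```

With these counts the baseline lands between roughly 55% and 58% for every dataset, so the title-only classifier does no better than guessing the majority class, while the other three beat it by about ten points or more.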

The combined (subtitles+plots) classifier is available here. You can toss movie plots or subtitles (and probably scripts) at it and it will do its best to predict whether the movie passes the Bechdel test or not.


The predictive accuracy of the classifier may not be the best, and it certainly doesn’t figure out sub-rules 1-3 just by looking at unigrams and bigrams. But it does capture something that lets it predict roughly 70% correctly. I’m curious to find out exactly what it bases its decisions on and will write another blog post about this.

Update: Here is an analysis.


Much kudos to the maintainers of the Bechdel test database. Thanks also to the providers of the simple-to-use plot API.