Natural language processing tasks like this usually call for specialized libraries, but here we'll simply ask an AI to write some code that counts how often the pronouns 'he', 'him', 'his', 'she', 'her', and 'hers' appear.
Here's an example of a prompt we could use:
using javascript in the inspector console, do a word frequency count on the contents of a web page for the terms "he", "him", "his", "her", "hers", and "she" and provide results as a table
ChatGPT produced the following:
// Define the target words
const targetWords = ['he', 'him', 'his', 'she', 'her', 'hers'];

// Function to count the occurrences of target words in text
function countWords(text) {
  const wordCounts = {};
  const words = text.toLowerCase().split(/\s+/); // Tokenize text by spaces
  for (const word of words) {
    if (targetWords.includes(word)) {
      wordCounts[word] = (wordCounts[word] || 0) + 1;
    }
  }
  return wordCounts;
}

console.table(countWords(document.body.textContent))
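One caveat about the generated code: because it splits on whitespace only, punctuation stays attached to each token, so "he," or "her." does not match the target list and goes uncounted. A minimal sketch of a variant that matches whole words with a regular expression might look like the following. The names `pronouns` and `countPronouns` are hypothetical, and this is not the code that produced the results reported in this section.

```javascript
// Hypothetical variant: match whole words so that "he," or "her." are counted too.
const pronouns = ['he', 'him', 'his', 'she', 'her', 'hers'];

function countPronouns(text) {
  // Start every pronoun at zero so that absent words still show up in the table
  const counts = Object.fromEntries(pronouns.map((p) => [p, 0]));
  // \b marks a word boundary; the g flag collects every match in the text
  const pattern = new RegExp(`\\b(?:${pronouns.join('|')})\\b`, 'g');
  const matches = text.toLowerCase().match(pattern) || [];
  for (const word of matches) {
    counts[word] += 1;
  }
  return counts;
}

// In the browser console:
// console.table(countPronouns(document.body.textContent));
```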
For War and Peace, the results are as follows:
And here are the results for Anna Karenina:

If we take these results and summarize them in a spreadsheet, we get the following:
| | War and Peace: Frequency | War and Peace: % | Anna Karenina: Frequency | Anna Karenina: % |
|---|---|---|---|---|
| Masculine as Subject | 9,298 | 74.25% | 7,305 | 63.72% |
| Feminine as Subject | 3,225 | 25.75% | 4,159 | 36.28% |
| Masculine as Object | 10,663 | 72.83% | 6,858 | 63.21% |
| Feminine as Object | 3,977 | 27.17% | 3,992 | 36.79% |
From this, we see that male characters seem to drive the narrative of each book: masculine pronouns outnumber feminine ones by roughly 3-to-1 in War and Peace and a little under 2-to-1 in Anna Karenina, despite the fact that the latter has a woman as the titular character.
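The percentages and ratios can be checked with a few lines of arithmetic. The sketch below assumes each percentage is a count divided by the combined masculine and feminine total for that row pair; the function name `pronounShares` is hypothetical.

```javascript
// Share of masculine vs. feminine pronouns, as percentages of their combined total
function pronounShares(masculine, feminine) {
  const total = masculine + feminine;
  return {
    masculinePct: ((100 * masculine) / total).toFixed(2),
    femininePct: ((100 * feminine) / total).toFixed(2),
    // How many masculine pronouns appear for each feminine one
    ratio: (masculine / feminine).toFixed(2),
  };
}

// Subject counts taken from the table above
console.log(pronounShares(9298, 3225)); // War and Peace: 74.25% vs. 25.75%, ratio 2.88
console.log(pronounShares(7305, 4159)); // Anna Karenina: 63.72% vs. 36.28%, ratio 1.76
```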
Does this pattern hold in classic works written by women, such as those of Jane Austen or Emily Bronte?