    Challenge 4

    Analyze Pronoun Usage in Tolstoy's Novels

    JavaScript

    Background

    Advocates for women's rights and representation in arts and literature regularly point out that women are often treated as subordinate to men or as taking supportive roles to men in books, movies, etc. Can we find evidence for this in classical literature? What about cases where women are the title characters of novels?

    Task

    Compare pronoun frequency and usage in two of Tolstoy's most well-known works, War and Peace and Anna Karenina, to see if masculine and feminine characters have similar treatment in each work. The full text of each work is available as HTML through Project Gutenberg.

    Most natural language processing tasks like this require complex libraries, but here we'll simply ask an AI to write a short script that counts how often the pronouns 'he', 'him', 'his', 'she', 'her', and 'hers' appear on a page.

    Here's an example of a prompt we could use:

    using javascript in the inspector console, do a word frequency count on the contents of a web page for the terms "he", "him", "his", "her", "hers", and "she" and provide results as a table

    ChatGPT produced the following:

    // Define the target words
    const targetWords = ['he', 'him', 'his', 'she', 'her', 'hers'];

    // Function to count the occurrences of target words in text
    function countWords(text) {
      const wordCounts = {};
      const words = text.toLowerCase().split(/\s+/); // Tokenize text by spaces

      for (const word of words) {
        if (targetWords.includes(word)) {
          wordCounts[word] = (wordCounts[word] || 0) + 1;
        }
      }

      return wordCounts;
    }

    console.table(countWords(document.body.textContent))
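One thing to notice about the script above: because it splits only on whitespace, a pronoun with punctuation attached (for example `him.` or `hers?"`) does not match any target word and is not counted. The following is a minimal sketch of one possible variant that extracts runs of letters instead. The names `pronouns` and `countPronouns` are hypothetical and are not part of the ChatGPT output, and the results reported in this challenge come from the whitespace-based script, not from this variant.

```javascript
// Same six pronouns as the script above (the name `pronouns` is an illustrative choice)
const pronouns = ['he', 'him', 'his', 'she', 'her', 'hers'];

// Hypothetical variant: pull out runs of letters so that surrounding
// punctuation and quotation marks are not part of any token
function countPronouns(text) {
  // Start every pronoun at zero so missing words still appear in the table
  const counts = Object.fromEntries(pronouns.map((p) => [p, 0]));
  // \p{L} matches any Unicode letter; the `u` flag enables property escapes
  const tokens = text.toLowerCase().match(/\p{L}+/gu) || [];

  for (const token of tokens) {
    if (pronouns.includes(token)) {
      counts[token] += 1;
    }
  }

  return counts;
}

// Small check on a sentence where two pronouns touch punctuation
const sample = 'He gave her his book. "Is it hers?" she asked him.';
console.table(countPronouns(sample));

// In the browser console, on a page containing the novel's text:
// console.table(countPronouns(document.body.textContent));
```

With this approach, contractions such as "he'd" are split at the apostrophe and count toward "he". Whether that is desirable depends on the question being asked, so treat it as a design choice rather than a correction.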

    For War and Peace, the results are as follows:


    And here are the results for Anna Karenina:

    If we take these results and summarize them in a spreadsheet, we get the following:

                            War and Peace           Anna Karenina
                            Frequency    %          Frequency    %
    Masculine as Subject     9,298     74.25%        7,305     63.72%
    Feminine as Subject      3,225     25.75%        4,159     36.28%
    Masculine as Object     10,663     72.83%        6,858     63.21%
    Feminine as Object       3,977     27.17%        3,992     36.79%

    From this, we see that masculine pronouns dominate both books, which suggests that male characters drive the narrative of each: the ratio is roughly 3-to-1 for War and Peace and a little under 2-to-1 for Anna Karenina, despite the fact that the latter has a woman as the titular character.
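The percentages and ratios in the summary can be reproduced without a spreadsheet. The sketch below takes the frequencies exactly as they appear in the summary table and computes each share and the masculine-to-feminine ratio. The helper name `shares` and the row labels are hypothetical, and the text does not specify how the six raw pronoun counts were grouped into the subject and object rows, so that grouping step is not shown.

```javascript
// Hypothetical helper: percentage share of each count in a pair, plus the
// masculine-to-feminine ratio, all rounded to two decimal places
function shares(masculine, feminine) {
  const total = masculine + feminine;
  const pct = (n) => Math.round((n / total) * 10000) / 100;

  return {
    masculinePct: pct(masculine),
    femininePct: pct(feminine),
    ratio: Math.round((masculine / feminine) * 100) / 100,
  };
}

// Frequencies copied from the summary table
const summary = {
  'War and Peace, subject': shares(9298, 3225),
  'War and Peace, object': shares(10663, 3977),
  'Anna Karenina, subject': shares(7305, 4159),
  'Anna Karenina, object': shares(6858, 3992),
};

console.table(summary);
```

Run on these figures, the subject ratios come out to about 2.88 for War and Peace and about 1.76 for Anna Karenina.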

    Does this pattern hold in classic works written by women, such as the novels of Jane Austen or Emily Brontë?

       

    This content is provided to you freely by EdTech Books.

    Access it online or download it at https://edtechbooks.org/elearning_hacker/analyze_pronoun_usage_in_tolstoys_war_and_peace.