Foundational Concepts

Before we get into the details of planning and creating educational assessments, we need to define and discuss a few essential concepts and terms.

A key aspect of assessment involves understanding the concepts of learning and intelligence.

Intelligence

Intelligence is the ability to reason, understand, feel, and perceive. It encompasses various cognitive processes, including logic, memory, comprehension, and creativity. We use our intelligence to solve problems, adapt to new situations, and achieve goals. And while intelligence is clearly a cognitive process, Howard Gardner (1983) suggests that some physical abilities might also be considered a form of intelligence.

Gardner's Multiple Intelligences Categories

Logical-mathematical: The capacity for logical reasoning, problem-solving, and working with numbers and abstract concepts. Individuals with high logical-mathematical intelligence excel in mathematics, science, and programming. People with this type of intelligence also do well with writing, which involves exposition, argumentation, definition, classification, and analysis.

Verbal-linguistic: The ability to use language effectively for communication. This intelligence includes reading, writing, and speaking skills, as well as being sensitive to words' meanings, sounds, and rhythms. People with this type of intelligence like reading, poetry, tongue twisters, puns, humor, puzzles, and riddles.

Bodily-kinesthetic: This intelligence involves using one's body skillfully and handling objects dexterously. People with high bodily-kinesthetic intelligence excel in sports, dance, and crafts. This intelligence is often seen in athletes, dancers, surgeons, and craftspeople. They enjoy dramatics, role-playing, dancing, and physical expression.

Musical: The capacity to recognize, create, and reproduce music and rhythms. Individuals with high musical intelligence have a strong sense of rhythm, pitch, and melody. This intelligence is evident in musicians, composers, and those with a strong sensitivity to sound patterns. They appreciate music and enjoy creating and performing musical numbers.

Visual-spatial: The ability to think in three dimensions, visualize spatial relationships, and manipulate objects in space. This includes skills in geometry, architecture, and visual arts. People with high spatial intelligence are skilled at navigation, visual arts, and solving puzzles. They respond to visual cues and like inventing and designing.

Interpersonal: The ability to understand and interact effectively with others. This includes having a strong sense of empathy and the ability to discern the moods, motivations, and intentions of others. Individuals with high interpersonal intelligence are social. They enjoy being with their peers and working cooperatively with others.

Intrapersonal: The capacity for self-awareness and self-reflection. This includes understanding one's own emotions, motivations, and inner states. People with high intrapersonal intelligence have a strong sense of their emotions, values, and goals. They are often self-motivated and may like to work independently and use their self-awareness to guide their behavior and decision-making.

Naturalistic: The ability to recognize and categorize plants, animals, and other aspects of the natural environment. Individuals with high naturalistic intelligence are skilled at understanding and working with plants, animals, and natural phenomena. This intelligence is evident in biologists, environmentalists, and those with a strong affinity for nature and the outdoors. They typically appreciate nature and like to be outside.


While we value many types of intelligence, in a school setting, we often focus primarily on assessing logical-mathematical and verbal-linguistic intelligence. Measuring multiple intelligences requires various assessment methods because each type of intelligence is distinct and manifests itself differently. Traditional tests often focus on linguistic and logical-mathematical abilities, which will not accurately capture an individual's strengths in other areas. Educators must utilize diverse approaches to comprehensively assess multiple intelligences, including objectively scored traditional tests, performance assessments, and alternative assessments like observations and self-report interviews.

Measuring Intelligence

Because intelligence varies by degree between individuals, we can measure an individual's ability, skill, and knowledge. These measures can be used to compare one person's intelligence with another. We can also use these measures to identify increases in intelligence and ability.

Learning

As noted above, intelligence varies by degree from one individual to another, and this variation is what makes it measurable. Humans also have the capacity to increase or develop their intelligence and ability; this development is what we call learning. This means we can assess the degree to which an individual has a certain form of intelligence, ability, or skill at a given point in time.

You will not be surprised to find that there has been much debate on the topic of learning. Two specific views of learning are represented by the behaviorist and constructivist viewpoints. Both have merit and impact the ways we assess learning and intelligence.

The behavioral view of learning, also known as behaviorism, is a psychological perspective that emphasizes the role of observable behavior in the learning process. According to this view, learning is defined as a change in behavior that occurs as a result of an individual's interaction with their environment. Behaviorism emphasizes the role of reinforcement and punishment in shaping behavior, which is equated with learning.

In the behavioral view, learning occurs through the association between a stimulus and a response, where desirable behaviors are reinforced and undesirable behaviors are discouraged. Reinforcement can be positive (presenting a rewarding stimulus) or negative (removing an aversive stimulus). Both types of reinforcement increase the likelihood of the behavior being repeated, leading to learning. Conversely, punishment aims to decrease the frequency of unwanted behaviors by introducing negative consequences, which is also considered a form of learning.

Behaviorists measure learning through observable changes in behavior, focusing on the measurable outcomes of the learning process rather than internal mental processes. They argue that learning occurs when a demonstrable change in an individual's behavior results from their interaction with the environment.

The behavioral perspective of measuring learning focuses on our ability to observe and measure changes in behavior. This approach involves setting specific behavioral objectives, conducting baseline assessments, reinforcing students' abilities through instruction, collecting quantifiable data as evidence of learning, and evaluating learning against predetermined criteria or performance standards. Continuous assessment is employed throughout the instructional process to monitor progress and provide feedback.
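The steps in this approach (a baseline measure, a later measure, and a predetermined criterion) can be sketched in a few lines of Python. This is only an illustration: the student labels, the scores, and the criterion of 80 are hypothetical values chosen for the example, not figures from any real assessment.

```python
# Hypothetical sketch: comparing a baseline and a post-instruction score
# against a predetermined performance criterion.

def evaluate_learning(baseline, post, criterion):
    """Return the observed gain and whether the post score meets the criterion."""
    gain = post - baseline
    return {"gain": gain, "met_criterion": post >= criterion}

CRITERION = 80  # assumed performance standard (percent correct)

# (baseline score, post-instruction score) for two made-up students
students = {"Student A": (40, 85), "Student B": (55, 70)}

results = {
    name: evaluate_learning(baseline, post, CRITERION)
    for name, (baseline, post) in students.items()
}

for name, outcome in results.items():
    print(name, outcome)
```

Note that both students show a measurable change in performance, but only one reaches the criterion; gain and mastery are separate questions.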

While the behavioral view of learning and its measurement has been influential, it has also faced criticism for its reductionist approach to intelligence and its lack of attention to the complex cognitive and social factors that influence learning. Critics argue that this perspective may overlook critical internal processes and the role of individual differences in the learning experience.

The constructivist view of learning is a psychological perspective that emphasizes the active role of the learner in constructing their own knowledge and understanding of the world. According to this view, learning is not a passive process of absorbing information but rather an active process of meaning-making, where individuals build upon their prior knowledge and experiences to create new understandings. Unlike behaviorism, which relies on reinforcers and punishments to shape behavior, constructivism posits that students have a natural inclination to learn, which needs to be directed, encouraged, and guided by educators.

Constructivists argue that knowledge is not simply transmitted from teacher to student but must be constructed by the learner through their interactions with the environment, social relationships, and cognitive processes. In this context, the learner is seen as an active participant in the learning process, actively exploring, questioning, and interpreting information to build their own mental models and theories. The constructivist view of learning emphasizes the importance of social interaction, collaboration, and the negotiation of meaning in the learning process.

In the constructivist view, measuring learning is a complex and multi-faceted process that focuses on assessing the learner's understanding, application, and creation of knowledge. Constructivists believe that learning is an active process of constructing meaning, and therefore, assessment should be designed to capture the depth and quality of a learner's understanding rather than simply measuring the acquisition of facts or skills. Because learning cannot always be observed directly, constructivist assessments ask students to demonstrate their reasoning and their ability to apply knowledge in meaningful ways.

Bloom's Taxonomies

Bloom's Taxonomy categorizes learning objectives into three domains: Cognitive, Affective, and Psychomotor, each focusing on different aspects of learning and skill development.

Cognitive Domain: This domain deals with knowledge outcomes, intellectual abilities, and skills. It involves mental processes such as remembering, understanding, applying, analyzing, evaluating, and creating. The focus is on developing cognitive skills and the ability to process and use information effectively. This is the domain most prominently assessed in schools. 

Affective Domain: This domain focuses on attitudes, interests, values, emotions, and social norms. It deals with how individuals develop personal perspectives and values, that is, how they become who they are. Key areas include receiving (awareness), responding (participation, gaining perspective), valuing (attachment), organizing (resolving and prioritizing values), and becoming (internalizing values). While educators find many of these dispositions important for students to develop, they are difficult to measure and are therefore encouraged in schools but rarely measured.

Psychomotor Domain: This domain addresses motor skills and physical abilities. It involves the development of physical coordination, movement, and the use of motor skills. Learning in this domain progresses from basic motor skills to complex physical activities, emphasizing the physical execution of tasks. Some of these skills are taught in schools, but they are rarely measured with care. Subjects like physical education, music, drama, and dance, where they are offered, are often graded only superficially, most commonly with a participation grade.

Although schools typically focus on the cognitive domain, these three domains provide a comprehensive framework for understanding and assessing various aspects of learning and human development.

Bloom's Taxonomy for the Cognitive Domain is a hierarchical model of cognitive learning objectives developed by Benjamin Bloom and his colleagues in the 1950s. The taxonomy provides a structured framework for categorizing educational goals and learning objectives into six levels of increasing complexity: knowledge, comprehension, application, analysis, synthesis, and evaluation. These levels are organized in a cumulative hierarchy, with each level building upon the skills and abilities acquired in the previous levels. However, there is some debate about whether the three higher levels actually have a hierarchical relationship.

In 2001, a revised version of Bloom's Taxonomy was published, which updated the terminology and structure of the original taxonomy. The revised taxonomy includes six cognitive processes: remembering, understanding, applying, analyzing, evaluating, and creating. This revision emphasizes the active nature of cognitive processes.

Bloom's Taxonomy has significantly impacted educational practice and is widely used to foster comprehensive and progressive learning experiences. It helps educators design learning activities that promote higher-order thinking skills and a deeper understanding of subject matter, ensuring that students develop a range of cognitive skills, from basic recall of facts to the ability to make judgments and evaluations based on criteria and standards.

Cognitive objectives. These objectives measure thinking skills. They deal with lower-level thinking skills, like knowing facts and understanding concepts, as well as higher-level cognitive abilities, like analyzing, critical thinking, synthesizing ideas, and evaluating. Tests that measure cognition require learners to recall, explain, compare, justify, and produce logical arguments (see Figure 1). In schools, cognitive assessments are common. The data from these assessments are used to evaluate student learning and judge the effectiveness of educational products.

Figure 1: Bloom's Taxonomy for the Cognitive Domain.

Performance objectives. These abilities typically fall in the psychomotor domain and must be measured using performance assessments. However, some skills, like writing, overlap with the cognitive domain in that they provide evidence of higher-level cognitive abilities. Tests that measure performance require learners to demonstrate their ability to perform a task or skill, not just know how to do it (i.e., the steps required). Behaviors or abilities that require performance assessments might include reading, speaking, singing, writing, cooking, or doing some other clearly defined task (i.e., a job). These assessments are subjectively scored and require the use of a rubric. The rubric (or scoring guide) outlines essential aspects of the skill and the criteria for judging competence. This is best done with expert reviewers or judges. For example, the ability to speak a foreign language should be tested with an oral proficiency interview. Those administering the test should be experts in the language and trained in the assessment’s administration protocols. If the goal is to test a person’s oral communication skills, it would not be acceptable to have a student pass a vocabulary test, or a reading test, then declare them a capable speaker. They must be able to speak fluently, have an adequate vocabulary, and respond appropriately (i.e., intelligently) to prompts.
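As a rough illustration of how a rubric turns expert judgments into a score, the sketch below averages two raters' ratings on each criterion and then sums the criteria. The three criteria echo the speaking example above, but the four-point levels, the equal weighting, and the ratings themselves are assumptions made for this example rather than a recommended rubric.

```python
from statistics import mean

# Hypothetical rubric: each criterion is rated on levels 1..4.
RUBRIC_MAX_LEVEL = {"fluency": 4, "vocabulary": 4, "appropriateness": 4}

def rubric_score(ratings_by_rater):
    """Average each criterion across raters, then sum the criterion averages."""
    per_criterion = {}
    for criterion, max_level in RUBRIC_MAX_LEVEL.items():
        levels = [ratings[criterion] for ratings in ratings_by_rater]
        if any(not 1 <= level <= max_level for level in levels):
            raise ValueError(f"rating out of range for {criterion}")
        per_criterion[criterion] = mean(levels)
    return per_criterion, sum(per_criterion.values())

# Made-up ratings from two trained raters for one oral interview
rater_1 = {"fluency": 3, "vocabulary": 4, "appropriateness": 3}
rater_2 = {"fluency": 4, "vocabulary": 4, "appropriateness": 2}

per_criterion, total = rubric_score([rater_1, rater_2])
print(per_criterion, total)
```

Reporting the per-criterion averages alongside the total keeps the feedback tied to the specific aspects of the skill the rubric describes.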

Affective objectives. Affect refers to personal feelings: one’s beliefs, opinions, attitudes, dispositions, and emotions. When evaluating educational products, we usually attempt to measure satisfaction (an emotion). However, at times, the objective of a particular educational program is to help students develop specific dispositions or attitudes. A course curriculum may have the goal of developing students’ character or generating a specific perspective. For example, educators prefer that students develop an internal locus of control (i.e., a belief that their efforts make a difference). Having this attitude is believed to result in better effort on the part of the student and, as a result, more learning. However, with most affective characteristics, you cannot simply ask someone to tell you directly what they feel. You must measure the degree to which they hold a specific attitude or opinion using a scale. Some scales used to measure specific affective characteristics already exist; others would need to be created.
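One common way to build such a scale is to average a respondent's ratings across several agree/disagree statements, reversing the negatively worded ones first. The sketch below assumes a five-point response format; the item labels, the choice of which items are reverse-keyed, and the responses are all hypothetical and do not come from any published instrument.

```python
# Hypothetical sketch: scoring a short attitude scale with reverse-keyed items.

def scale_score(responses, reverse_items, n_points=5):
    """Mean item score after reversing negatively worded items.

    responses: dict mapping item label -> rating from 1 to n_points
    reverse_items: set of item labels whose ratings must be reversed
    """
    total = 0
    for item, value in responses.items():
        if not 1 <= value <= n_points:
            raise ValueError(f"rating out of range for {item}")
        # Reversal maps 1 -> n_points, 2 -> n_points - 1, and so on.
        total += (n_points + 1 - value) if item in reverse_items else value
    return total / len(responses)

# Made-up responses; q3 and q4 are assumed to be negatively worded
responses = {"q1": 5, "q2": 4, "q3": 2, "q4": 1}
score = scale_score(responses, reverse_items={"q3", "q4"})
print(score)  # 4.5
```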

Important Terms and Definitions

It is challenging to communicate about assessment if we don't have a correct and shared understanding of some basic terms. Many of these terms are misused, which can lead to confusion.


Objective and Subjective scoring: People often mistakenly believe it is always better to use objective rather than subjective scoring because it can reduce potential bias. However, there are times when subjective scoring is required. Objective scoring simply means that experts are more likely to agree on the correct answer. It typically involves multiple-choice or other fixed-choice response items. This way of scoring test items is cost-effective, highly reliable, and efficient; computers and untrained individuals can easily score these items. However, a criticism of objectively scored tests is that they tend to overemphasize factual knowledge and low-level cognitive ability, as it is difficult to create objectively scored items that measure higher-level learning objectives. In addition, many test forms (e.g., performance assessments) require subjective scoring. Subjective scoring is needed when there is more than one way to correctly answer a question. It is best done by experts who have a good understanding of the criteria and expectations of the skill or ability being tested, because they must judge the adequacy of each response. These tests can be time-consuming to grade, as various responses may be deemed adequate.
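Because the distinction above rests on how often scorers agree, it can help to see how agreement might be quantified. The sketch below computes simple percent agreement and Cohen's kappa (agreement corrected for chance) for two raters. Using these particular statistics is an assumption of this example, not something the chapter prescribes, and the pass/fail ratings are invented.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of responses on which the two raters gave the same rating."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance.

    Undefined (division by zero) if chance agreement is exactly 1.
    """
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Made-up pass (P) / fail (F) judgments on ten essay responses
rater_a = ["P", "P", "P", "P", "P", "F", "F", "F", "F", "F"]
rater_b = ["P", "P", "P", "P", "F", "F", "F", "F", "F", "P"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, round(kappa, 2))
```

Here the raters agree on 8 of 10 responses, but because half of that agreement would be expected by chance alone, kappa is noticeably lower than the raw agreement rate.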


Speed and power testing represent two distinct approaches to evaluating a student's ability. Speed testing focuses on how well a student can complete a given task within a limited amount of time. It measures the ability to quickly and accurately process information or complete a task, highlighting skill development or automaticity of thinking skills and immediate recall of basic facts.

On the other hand, power testing allows students ample time to complete a task, focusing more on their overall ability to understand and apply concepts, solve problems, and deliver high-quality work. These tests may still be timed, but sufficient time is provided to allow students to demonstrate their full potential and depth of understanding.


Norm-referenced and criterion-referenced tests are two distinct methods of assessment that differ in their primary focus and how the results are interpreted and reported.

Norm-referenced tests aim to compare a test-taker's performance to that of a group of individuals who have taken the same test. This type of testing is primarily used to determine differences between individuals so they can be ranked based on their relative performance within the group. Norm-referenced tests are commonly used when the goal is to identify high and low performers, such as in college admissions or talent identification programs.

In contrast, criterion-referenced tests evaluate an individual's performance against a predefined standard, criteria, or expectation rather than comparing them to others. These tests measure a person's proficiency or mastery in a specific area, determining whether they have acquired the required knowledge or skills. The focus is on whether the individual has met the established benchmark rather than their relative standing among peers. Criterion-referenced tests are often used in educational settings to assess the accomplishment of specific learning outcomes or in certification exams to ensure that individuals have the necessary competencies to perform a specific job or task.

While some tests are designed to have both a norm-referenced and a criterion-referenced component, the two types of tests are constructed differently, and their results are reported differently. The design of these tests differs in the following ways:

Norm-referenced tests are designed to maximize differences among individuals. These tests often cover a broad domain of learning tasks, with only a few items measuring each specific task. This approach allows the test to capture a variety of abilities for those taking the test. Test designers typically exclude items that everyone will answer correctly or incorrectly (i.e., very easy or extremely difficult items), preferring items of average difficulty. This strategy maximizes the test's ability to identify individual differences. These tests require a well-defined norm group so that a student's performance can be interpreted as a relative standing within that group. The results are often reported as a scale score, z-score transformation, percentile rank, or grade equivalent score.
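The idea of screening out very easy and very difficult items can be illustrated with a simple item difficulty index: the proportion of test-takers who answer each item correctly. In the sketch below, the response data and the cut-offs of 0.2 and 0.9 are hypothetical; actual test developers choose thresholds to suit their purpose and usually consider other item statistics as well.

```python
# Hypothetical sketch: flagging items that nearly everyone gets right or wrong.

def item_difficulty(responses):
    """Proportion correct for each item (rows = students, 1 = correct, 0 = incorrect)."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_students for i in range(n_items)]

def flag_items(difficulties, low=0.2, high=0.9):
    """Indices of items that are too hard (below low) or too easy (above high)."""
    return [i for i, p in enumerate(difficulties) if p < low or p > high]

# Made-up pilot data: 5 students by 4 items
responses = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
]

difficulties = item_difficulty(responses)
flagged = flag_items(difficulties)
print(difficulties, flagged)
```

In this made-up data, the first item is answered correctly by everyone and the third by no one, so neither helps distinguish among students.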

It is important to note that just because an individual performs well compared to others does not necessarily mean that their performance was adequate (i.e., met an expected standard). However, if the test has a well-defined norm group, the results can be used to identify a range of typical performance, which can then be used to establish a criterion for satisfactory performance.
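Two of the norm-referenced score types mentioned above, the z-score and the percentile rank, can be computed directly from a norm group's scores. The following is a minimal sketch: the norm-group scores are invented, the population standard deviation is assumed, and the percentile rank uses one common definition (percentage scoring below, plus half of those scoring the same); other definitions exist.

```python
from statistics import mean, pstdev

def z_score(score, norm_scores):
    """Distance of a score from the norm-group mean, in standard deviation units."""
    return (score - mean(norm_scores)) / pstdev(norm_scores)

def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below the score, counting half of ties."""
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)

# Made-up norm group of five raw scores
norm_group = [50, 60, 70, 80, 90]

z = z_score(90, norm_group)
rank = percentile_rank(80, norm_group)
print(round(z, 2), rank)
```

Both numbers describe only where a student stands relative to this group; neither says whether the performance meets any particular standard.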

Criterion-referenced tests are designed to measure an individual's performance against a predefined standard or criterion. These tests focus on specific learning goals, standards, or competencies that students are expected to achieve. The items are selected to assess mastery of those objectives. The difficulty level of the items is less important than the test adequately measuring essential knowledge and skills.

The results of criterion-referenced tests are typically reported as a score indicating the percentage of items answered correctly or as a pass/fail status based on a predetermined cut-off score. The report indicates whether students have met the expected standard. A pass criterion needs to be established, which may differ for different groups of students taking the same test. Examples of criterion-referenced tests include end-of-unit exams in a classroom or certification exams like a driving test.
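The reporting described above amounts to a percent-correct calculation followed by a comparison with a cut-off score. The sketch below shows this with hypothetical numbers; the group labels and the cut-off scores of 80 and 70 percent are assumptions used only to show how the same raw score can lead to different decisions under different pass criteria.

```python
# Hypothetical sketch: criterion-referenced reporting with a cut-off score.

def percent_correct(n_correct, n_items):
    """Percentage of items answered correctly."""
    return 100.0 * n_correct / n_items

def mastery_decision(n_correct, n_items, cut_score_pct):
    """Return 'pass' if the percent correct meets or exceeds the cut-off score."""
    return "pass" if percent_correct(n_correct, n_items) >= cut_score_pct else "fail"

# Assumed pass criteria for two groups taking the same 20-item test
cut_scores = {"certification candidates": 80.0, "practice group": 70.0}

decisions = {
    group: mastery_decision(15, 20, cut) for group, cut in cut_scores.items()
}
print(percent_correct(15, 20), decisions)
```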


 

A Standardized Test is an assessment that requires all students to take the same exam under the same conditions. These assessments are scored in a "standard" or consistent manner, making it possible to compare the relative performance of individuals or groups of students.


Traditional, Alternative, and Authentic Assessments

Traditional assessment refers to conventional methods of evaluating student learning, typically using objectively scored, paper-and-pencil (or computer-administered) tests, quizzes, and exams. These assessments often measure students' achievement using multiple-choice, true/false, matching, and alternative response questions but may also include short-answer and essay questions. The primary purpose of traditional assessments is to provide a quantifiable measure of student achievement that describes a student's knowledge and abilities.

Traditional assessments can be either standardized or classroom-based. They can serve summative or formative purposes. And while traditional assessment methods have been widely used and can provide valuable information about student learning, they have also faced criticism for their limitations. One oft-cited criticism is that traditional assessments focus on a narrow range of skills and knowledge, primarily those easily measured through objectively scored test items. Traditional assessments are often associated with the lower levels of Bloom's Taxonomy, including knowledge, comprehension, and application. However, it is important to note that traditional tests can be designed to measure higher-level learning through the use of short answers, essays, context-dependent items, and carefully constructed multiple-choice (best answer) test items. These types of questions require students to demonstrate deeper understanding, analysis, evaluation, and critical thinking skills.

Alternative assessments are a broad category of evaluation methods that differ from traditional assessments in their approach to measuring student learning and performance. Often, when people refer to alternative assessment methods, they mean any testing method other than a traditional test, including allowing students to demonstrate their learning through various means, such as oral presentations, performances, portfolios, projects, or other creative works. This approach provides opportunities for them to showcase their abilities in diverse ways.

Sometimes, alternative assessment is called authentic assessment, although not all alternative tests are truly authentic. While alternative assessment aims to assess student learning in a student-centered real-world context, often the best we can do is simulate real-world (i.e., authentic) assessment situations.

Key characteristics of alternative assessments include:

1. Authentic tasks: Alternative assessments often involve tasks that resemble real-world situations or problems, allowing students to demonstrate their knowledge and skills in a more meaningful and relevant context.

2. Student-centered: Alternative assessments often involve greater student participation and ownership in the evaluation process. Students may have a role in selecting the tasks or projects they will complete, establishing the criteria by which their work will be judged, and engaging in self and peer assessment.

3. Ongoing and formative: Many alternative assessments are designed to be ongoing and formative, providing students with regular feedback and opportunities for improvement throughout the learning process. This approach helps students to monitor their own progress and make adjustments as needed. Alternative assessments tend to focus on the learning process rather than just the final product. This means that students are often assessed on their ability to plan, research, collaborate, and revise their work, in addition to the quality of the final output.

4. Higher-order thinking: Alternative assessments often challenge students to engage in higher-order thinking skills, such as analysis, synthesis, evaluation, and creation. These skills are essential for success in the 21st century and are not always effectively measured by traditional assessments.

5. Subjectively scored: Alternative assessment requires a subjective scoring mechanism. This means rubrics and scoring guides must be created to assign scores, although not all alternative assessments will provide a score.

 


Chapter Summary

  • Intelligence encompasses cognitive processes such as reasoning, memory, comprehension, and creativity.
  • Howard Gardner's theory of multiple intelligences identifies various types, including logical-mathematical, verbal-linguistic, bodily-kinesthetic, musical, visual-spatial, interpersonal, intrapersonal, and naturalistic intelligence.
  • Schools often focus on assessing logical-mathematical and verbal-linguistic intelligence, but a comprehensive assessment should include multiple forms of intelligence using diverse methods.
  • Intelligence varies among individuals; because of this, the degree to which individuals possess specific types of intelligence can be measured.
  • The two C’s: A primary purpose of assessment is to communicate the amount of intelligence an individual possesses and compare this to others.
  • Traditional tests often focus on linguistic and logical-mathematical abilities, which may not capture strengths in other areas, necessitating varied assessment methods.
  • Various learning theories impact the way we assess student learning:
    • Behaviorist View: Emphasizes observable behavior changes through reinforcement and punishment, focusing on measurable outcomes.
    • Constructivist View: Stresses the active role of learners in constructing knowledge through interactions and experiences, emphasizing the depth and quality of understanding.
  • Bloom's Taxonomy categorizes learning objectives into three domains: cognitive, affective, and psychomotor.
  • There are various domains that we can measure with assessment:
    • The cognitive domain, often assessed in schools, involves mental processes like remembering, understanding, applying, analyzing, evaluating, and creating.
    • The psychomotor domain addresses motor skills and physical abilities, but is less frequently measured in traditional educational settings.
    • The affective domain involves feelings, attitudes, and dispositions, but is also less frequently measured in traditional educational settings.

  • Objective vs. Subjective Scoring: Objective scoring involves clear, agreed-upon answers, while subjective scoring requires expert judgment and can assess higher-level cognitive abilities.
  • Speed vs. Power Testing: Speed tests focus on quick task completion, while power tests allow ample time to demonstrate deep understanding.
  • Norm-Referenced vs. Criterion-Referenced Tests: Norm-referenced tests compare performance to peers, while criterion-referenced tests measure proficiency against predefined standards.
  • Traditional assessments typically use objectively scored tests, which may be standardized or classroom-based, and are common in educational settings.
  • Alternative assessments, often more student-centered, include methods like projects, portfolios, and performances.
  • Authentic assessments aim to evaluate real-world application of skills and knowledge, focusing on higher-order thinking and meaningful tasks.

Discussion Questions

  1. How might educators balance the use of traditional and alternative assessments to provide a comprehensive evaluation of student learning? Consider the strengths and limitations of each approach, and discuss how they could be combined effectively in various educational contexts.
  2. How can assessment practices be adapted to fairly evaluate students with different types of intelligence? What challenges might educators face in implementing such a diverse assessment strategy?
  3. How might the contrasting behaviorist and constructivist views of learning influence the design and implementation of assessments? Discuss the potential impacts on both formative and summative assessment practices in educational settings.
  4. How can educators effectively incorporate all three domains of Bloom's Taxonomy (cognitive, affective, and psychomotor) into their assessment practices? Discuss the challenges and benefits of assessing beyond the cognitive domain.

References

Anderson, L. W., & Krathwohl, D. R. (Eds.). (2001). A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. New York: Longman.

Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (1956). Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook I: Cognitive Domain. New York: David McKay Company.

Gardner, H. (1983). Frames of Mind: The Theory of Multiple Intelligences. New York: Basic Books.

This content is provided to you freely by EdTech Books.

Access it online or download it at https://edtechbooks.org/Assessment_Basics/Test_Plans.