Histories and Foundations of Assessment

Introduction

<If we don't write an introduction to the book itself, we could include a brief overview and description of the book here, then go into the purpose of this chapter as it relates to the book. Otherwise, this chapter and its outline combine the chapters "Assessment and Instructional Design," "Historical Perspectives," and "The Role of Assessment." We will want to think about how we can shift language specific to IDs to include teachers/educators/instructors, OR whether we need to do that at all. Maybe we clarify that one hat teachers often have to wear is that of instructional designer, and go from there. I prefer the latter. - CS>

This chapter is meant to accomplish two objectives:

  1. Explain some foundational principles concerning what is and is NOT assessment, answering the question, "What do we assess?"
  2. Provide a brief historical context and discussion about assessment to answer the question, "Why do we need assessments?"

Spoiler for objective one: Everything, or nearly everything, can be seen as an assessment of some kind.

Before we get started, it is important that we clarify what we mean by "assessment." If we don't have a shared understanding of this term, we can quickly devolve into thinking that everything under the sun is an assessment of some kind. And while there may be some truth to this statement, this book focuses on assessments that are part of instructional design and practice. To begin, let's consider the differences between "evaluation" and "assessment."

Evaluation and Assessment

We first need to understand the difference between these two related terms. Assessment and evaluation are often used interchangeably, but they have distinct meanings and purposes.

Assessment refers to the process of gathering information about an individual's knowledge, skills, abilities, or other characteristics. Assessment often requires that we create instruments (e.g., tests) to measure these characteristics. However, assessment can take other forms, such as observations or interviews. The primary purpose of assessment is to gather accurate, often quantitative, information about an individual so we can communicate and compare results.

Evaluation, on the other hand, refers to the process of making judgments or decisions based on the results of an assessment. An evaluation aims to make value-based judgments about an individual's performance or cognitive ability; this often requires that we establish evaluation criteria.

The difference between these two terms is subtle. Assessment is descriptive, while evaluation involves judgment. Assessment is the process of gathering information, while evaluation is the process of making decisions based on the results of an assessment. An assessment becomes an evaluation when we make a determination about an individual based on assessment results.

Assessment, Grading, and Testing

While the terms "assessment" and "evaluation" are often used interchangeably, there is an important distinction between them. Likewise, there are subtle differences among the ideas of "assessment," "grading," and "testing."

Again, assessment refers to a broad process of gathering information about an individual. Assessment can be formative or summative, occurring before, during, or at the end of an established time frame or unit of learning. Assessment is part of the learning process as well as its conclusion. It provides educators and instructional designers with the opportunity to (a) check for understanding, (b) provide feedback to learners, (c) differentiate instruction, and (d) guide decisions about future instruction. The role assessment plays throughout the learning process is what makes it different from grading.

Grading is the process of assigning numeric scores, letter grades, or other scales to our measurements. For example, in the United States it is common to assign a score of 90% on an assessment a letter grade of "A" or "A-". Grades are meant to accurately communicate an individual's proficiency to various stakeholders: learners, parents, instructors, administrators, and others. While grading and assessment often go hand in hand, much of the assessment that occurs during the learning process should not be assigned a grade because the learning is still formative. If grades are assigned during the learning process, learners should be given the opportunity to return to those assessments and revise their work for higher scores, and instructors should revise such grades based on later assessment. Final grades for a unit, module, or course should reflect an individual's final level of proficiency, not the ups and downs of their proficiency throughout the learning process.
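
As an illustration of grading as the mapping of measurements onto a reporting scale, here is a minimal sketch in Python. The cutoffs are assumptions chosen for the example, not a standard endorsed by this chapter; institutions and instructors define their own scales.

```python
# Hypothetical percentage-to-letter-grade mapping (cutoffs are illustrative assumptions).
GRADE_SCALE = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (60, "D"), (0, "F"),
]

def letter_grade(percent: float) -> str:
    """Return the letter grade for a percentage score using the assumed cutoffs."""
    for cutoff, letter in GRADE_SCALE:
        if percent >= cutoff:
            return letter
    return "F"

print(letter_grade(90))  # prints "A-" under these assumed cutoffs
```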

Both assessment and grading become evaluations when you judge the quality or adequacy of the performance. Grading, for example, becomes an evaluative process when you set a passing score (i.e., what is good enough) or determine the letter grade a student must achieve in order to pass the course. 

In order to gather data used for assessment and grading, instructors typically use tests. Testing is simply the use of instruments or tools to measure or determine an individual's knowledge, skills, abilities, or other characteristics. As discussed in later chapters, these tests can take various forms and vary in their degree of formality, scope, and use.

Types of Assessment

<There are three main types of assessments, but these can also take on many different forms...>
Cognitive


Affective


Psychomotor

Purposes for Assessment

<Can we introduce performance assessments somewhere in this section as well? Or maybe it is introduced as part of the Types of Assessment above?

Additionally, I'd like to see some mention of Affective Assessment, Self Assessment, and Peer Assessment.>

Assessment serves multiple purposes in education, including:

Measuring Student Learning: Summative assessments measure achievement, enabling teachers to determine what students have learned (accountability) and to verify that they have accomplished the expected learning outcomes (certification). These types of assessments are most often evaluations.

Informing Instructional Planning: Formative assessments help teachers make informed decisions about the instructional needs of their students. The results of a formative assessment can help teachers plan the scope and focus of their instruction.

Assessing Readiness and Need: Placement assessments are a form of formative assessment that helps teachers determine a student's readiness for the planned instruction or whether a student needs to participate in the proposed instruction.

Diagnosing Learning Problems: Diagnostic assessment is used at an individual level rather than a group level. Its results are used to identify specific misconceptions a student may have or to uncover the reasons why they failed to accomplish a specific task. They also support detailed feedback to students – not just that a question was answered incorrectly, but why it may have been answered incorrectly or why a task was completed unsuccessfully.

Study Guides: Research has shown that testing can be an effective study technique (Karpicke & Blunt, 2011). For example, taking a test-your-understanding quiz can help students improve their retention and recall of information. The results also provide valuable feedback, helping students identify areas where they need to improve. In addition, taking practice tests can reduce test anxiety as students become more comfortable with the testing process and the types of items used in an assessment.

Evaluating Program Effectiveness: The results of assessments can be used to evaluate the effectiveness of educational programs and initiatives, helping teachers and schools make data-driven decisions about improving the education they provide. However, when evaluating a program, assessment results are but one piece of evidence that should be considered.

Instructional designers will need to create assessments for several of these purposes. This may include creating a test-your-understanding quiz, a unit review, or a summative assessment at the end of a course to certify that a student has accomplished the expected learning outcomes. Unfortunately, not all assessments are valid measures of what they intend to measure, and when they are not, the results cannot be used for their intended purpose. This is why an instructional designer needs to learn how to create learning objectives and quality assessment instruments that align with the goals of the instruction.

<Transition to understanding the historical reasons for why we have assessments.>

Background of Assessment in Instructional Design

The field of instructional design began to emerge in the mid-1900s. The military was the first to design instruction systematically; it needed to train soldiers quickly and efficiently to perform specific tasks. An essential aspect of this training was assessing a soldier's aptitude and ability to correctly carry out what they had learned. Over the next few decades, an Instructional Systems Design (ISD) approach was adopted by most instructional designers. The main goal of ISD was to outline the key steps that should be taken to ensure that quality instruction was created.

In the 1970s, the ADDIE model for designing and developing instruction became one of the first formal ISD models – reportedly developed by the Center for Educational Technology at Florida State University for the United States Armed Forces. ADDIE stands for Analyze, Design, Develop, Implement, and Evaluate. The analysis phase of the ADDIE model required a gap or needs analysis to determine the goals and objectives of the instruction to be developed. The original purpose of the evaluation phase focused on assessing student learning to determine whether the learning objectives of the course had been met. The results of a summative assessment were used to certify that students had accomplished the intended learning objectives and were the main criteria used to determine the effectiveness of the instruction. However, the purpose of evaluation in the model was later expanded to a more comprehensive view that included formative evaluations of the instructional approach, design, usability, and maintenance of the instructional product.

The ADDIE model is arguably the most prominent instructional design model, but many others have since been developed and promoted. The models differ, but all involve three broad activities an instructional designer must accomplish:

1) Establish the learning objectives for the instruction.

2) Decide how to assess the expected learning outcomes.

3) Design and develop instructional activities to facilitate the desired learning.

Wiggins and McTighe (2005) popularized this idea by coining the term Backward Design, or starting with the end in mind. Their book Understanding by Design includes the following steps: identify the desired results, determine acceptable evidence that the expected learning outcomes have been met, then plan learning experiences and instruction to facilitate the expected learning. This approach of establishing learning objectives and creating assessments before creating learning activities was not a new concept; Wiggins and McTighe effectively rebranded the ideas of Tyler, Gagné, Mager, and others – concepts that were the foundation of most ISD models developed in the 1950s and 1960s. As a result of Wiggins and McTighe's work, present-day educators and instructional designers have been reintroduced to these critical concepts.

Background of Assessment in Education

<Would this be different from the section above? I think it would be worth mentioning the development of standardized testing, etc.>

Challenges and Issues

Assessment specialists face many challenges when creating valid assessments. We have outlined a few here, but there are others.

Getting beyond recall and understanding. One of the biggest mistakes test creators make is focusing too heavily on the recall of basic information. This may be acceptable when a course's learning objectives intentionally focus on the ability to remember and understand facts and definitions; however, in many courses the instructional objectives target student learning beyond the initial levels of Bloom's Taxonomy (Bloom et al., 1956).

<Image Citation>

Measuring affective characteristics. Most of what we measure in schools and training situations falls within the cognitive domain. However, the instructional goals of a course often include affective objectives as well. Unlike knowledge, skills, and abilities, the affective domain includes personal characteristics such as attitudinal dispositions, values, beliefs, and opinions (e.g., interest, caring, empathy, and appreciation) (see Davies, 2021). Simon and Binet (1916), the fathers of intelligence testing, suggested that as important as assessing cognitive ability may be, we might be well served to first teach (and assess) character. Assessing these personal characteristics requires a different kind of assessment: it requires that we create a scale that measures the degree to which individuals possess a certain characteristic or quality.
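
To make the idea of a scale concrete, the sketch below shows one common way affective instruments are scored: several Likert-type items targeting the same characteristic are averaged, with reverse-worded items flipped first. The item names, response values, and five-point format are illustrative assumptions, not instruments described in this chapter.

```python
# Hypothetical five-point Likert responses for an "interest" scale (values are assumptions).
responses = {"interest_1": 4, "interest_2": 5, "interest_3_reversed": 2}

def scale_score(resp: dict, reverse_keys=("interest_3_reversed",), points: int = 5) -> float:
    """Average the item responses, flipping reverse-worded items, to estimate the trait level."""
    adjusted = [
        (points + 1 - value) if key in reverse_keys else value  # flip reverse-worded items
        for key, value in resp.items()
    ]
    return sum(adjusted) / len(adjusted)

print(scale_score(responses))  # higher scores indicate more of the characteristic (e.g., interest)
```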

High-stakes testing. One particularly contentious issue in schools is the political mandate to test students using standardized, summative assessments. A few issues arise from this policy. The first revolves around the idea that these tests don't assess the whole person. The "whole person issue" in assessment refers to the challenge of capturing a person's entire range of abilities, characteristics, and experiences in a comprehensive and accurate manner; using a single assessment to judge a person may be limiting. A second issue concerns balancing the need to assess with the need to teach. Some educators complain they are so focused on testing that they have little time to teach; this includes the problem of teaching to the test. A final issue relates to the need for such testing in the first place: many educators believe that the most important purpose for assessment in schools is formative, not summative.

Interpretation and inappropriate uses of assessment results. The inappropriate use of assessment results can also be a problem. Assessments are typically created for a specific purpose, and the results are not necessarily valid for other purposes; assessment results are only valid if appropriately interpreted and used for the assessment's intended purpose. For example, in schools, test scores are designed to evaluate individual students' knowledge, skills, and abilities. Unfortunately, they are also inappropriately used to judge the quality of the instruction provided. While the quality of the teacher or instruction may influence the results of an assessment, many students fail to achieve despite being provided quality instruction, and students often succeed despite their teachers' failings. A better assessment of teacher quality would require instruments explicitly designed for that purpose. Another example of inappropriate use occurs when we don't have a good measure of the intended learning outcomes. This can happen, for example, when we want to develop a specific affective characteristic but don't have a valid measure of the disposition; using an achievement test as an indirect substitute indicator would not be appropriate or valid practice. The challenge for assessment developers is to create direct, valid measures of the expected learning outcomes.



Chapter Summary

  • In the field of instructional design, the

Discussion Questions

  1. Consider a

References

Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. New York: David McKay Company.

Davies, R. (2021). Establishing and developing professional evaluator dispositions. Canadian Journal of Program Evaluation, 35(3).

Davies, R., & Murff, M. (2024). Using generative AI and GPT chatbots to improve instruction. INTED2024 Proceedings, pp. xx-xx.

Gagné, R. M. (1965). The conditions of learning (1st ed.). New York: Holt, Rinehart & Winston.

Karpicke, J. D., & Blunt, J. R. (2011). Retrieval practice produces more learning than elaborative studying with concept mapping. Science, 331(6018), 772-775.

Mager, R. F. (1984). Preparing instructional objectives (2nd ed.). Belmont, CA: David S. Lake.

Simon, T., & Binet, A. (1916). The development of intelligence in children (E. S. Kite, Trans.). Vineland, NJ: The Training School (Publication No. 11).

Tyler, R. W. (2013). Basic principles of curriculum and instruction. In Curriculum studies reader E2 (pp. 60-68). Routledge.

Wiggins, G., & McTighe, J. (2005). Understanding by design (2nd ed.). Alexandria, VA: Association for Supervision and Curriculum Development (ASCD).

Young, J., Davies, R., Jenkins, J., & Pfleger, I. (2019). Keystroke dynamics: Establishing keyprints to verify users in online courses. Computers in the Schools, 36(1), 1-21.
