What is Evaluation?
A basic dictionary definition of evaluation will often state that to evaluate is to make a judgment. Unlike other forms of inquiry, evaluation practice fundamentally requires the evaluator to make value-based judgments. Building on that idea, Michael Scriven (1991) described evaluation as determining the value, merit, and worth of something. Scriven's definition is concise, aligns well with the dictionary definition, and has been widely accepted within the field of evaluation. In 1994, however, the Joint Committee on Standards for Educational Evaluation (JCSEE) added the notion that evaluation should be systematic, stating that "evaluation is a systematic assessment of the worth and merit of an object" (p. 3). While this definition excludes the word "value," evaluation's root term, value, implies that the act of evaluating (i.e., determining merit and worth) will always require some value-based criterion by which the judgment is made (Stufflebeam & Coryn, 2014). In fact, the American Evaluation Association's (AEA) Guiding Principles for Evaluators (AEA, 2018) express the need for evaluators to identify and clearly communicate stakeholders' values when conducting an evaluation. Fitzpatrick et al. (2011) point out that evaluators differ in the value they assign to the things they are evaluating because their criteria differ. It is therefore incumbent on evaluators to clearly articulate the criteria on which they will base their evaluation findings. The expectation is that formal evaluations will be based on defensible criteria or clearly defined standards.
To evaluate requires us to make judgments.
Evaluation is the process of determining the merit, worth, and value of things.
Value and Valuing.
Part of the reason some choose to leave out the term "value" from the definition of evaluation has to do with the verbal association this term has with the concept of valuing and the distinction that needs to be made between something having value and one's personal values. Aside from the verbal association issue, understanding the relationship between value and values is essential. Something will have value for a specific reason given a specific context. The value (merit or worth) assigned by individuals to an object will differ depending on their values (morals, preferences, interests, goals, ethics).
Understanding this point is vital for evaluators because things rarely have intrinsic value. We all agree that life-sustaining objects like air, food, and water have intrinsic value. Having basic needs met is also considered necessary. Things like being loved, feeling that you belong, and being safe are widely valued because they are considered essential for our well-being and development (Maslow, 1970). Beyond that, things have value because they are useful or desirable to someone for some reason. In most cases, the act of valuing is personal. We decide that something has merit or worth because our morals, preferences, interests, goals, or ethics lead us to that conclusion.
Personal values influence the value (merit and worth) we place on things.
Criteria and Standards.
We can assess the value of things (i.e., evaluate) using personal criteria or some agreed-upon standard. When using personal criteria, our value assessment can be entirely subjective and potentially unreliable (inconsistent). Individuals don't always carefully consider the criteria by which they assess value. In addition, the context of the situation will influence our assessments of value. Failure to identify and use appropriate criteria may render our evaluation results invalid (i.e., inaccurate in terms of the object's actual value). In many cases, the consequence of making a poor evaluation is minor. However, some evaluations we make have higher stakes; in these situations, the consequences of obtaining inaccurate evaluation findings can be costly.
We set standards (agreed-upon criteria) to reduce the subjectivity of our value assessments. If no standards exist, we need to clearly articulate the criteria or define the standard by which we will judge the value, merit, and worth of the evaluand (i.e., the thing we are evaluating). Even when conducting informal personal evaluations, we would do well to identify and articulate the values by which we will make judgments.
Various types of criteria exist. An object might have value because of what we can accomplish with it (a utility or functionality criterion). Often things are valued for religious or ethical reasons (a moral or ethical criterion). Objects can have value for sentimental reasons or simply because they are attractive or interesting (a personal satisfaction or aesthetics criterion).
Formal evaluation should include defensible criteria.
People conduct evaluations every day. Most of these evaluations are informal. Some are important, and some are trivial. We consider the value, merit, or worth of various things every day; most often, we do this to help us make decisions. We might need to decide whether to have breakfast, so we consider the value of doing so. We might want to purchase an item, so we consider the item's worth in relation to the benefit we can derive from owning it. We might also consider the need to shower before going out and the merits of doing so compared to the consequences of not taking the time for personal hygiene. Our evaluations are always contextual, value-based, and influenced by personal preferences, interests, and goals.
The evaluations we will be discussing in this course are formal evaluations. Formal evaluations should be systematic, comprehensive, accurate, and ethical. Quality evaluations are based on defensible criteria and credible data collection methods; in addition, the data interpretations and the recommendations made must be deemed credible by some standard.
Evaluation and Research
When attempting to define evaluation, a distinction must inevitably be made between evaluation and research. Both are forms of inquiry and use similar methods. There are, however, a few key differences.
Purpose – one difference between research and evaluation is the reason for conducting the inquiry. The researcher's goal is to add to a field's body of knowledge, whereas the evaluator's goal is to provide information and recommendations to the client, usually to help the client make a decision.
Context – evaluation is conducted in a specific context, and the results may or may not be valid in other contexts. Research is meant to produce generalizable knowledge.
Investigator's Role – evaluators often work for a client as a consultant or service provider. As such, the client determines the questions and focus of the evaluation with advice from the evaluator. Researchers decide what they will study and what questions they will attempt to answer.
Quality Standards – research is considered valid if appropriate methods were used, the research controls for confounding variables, and the findings support the conclusions. Evaluations are regarded as credible when the evaluator is responsive to the needs of stakeholders, uses appropriate methods and procedures, and provides recommendations that are justified by the evidence, ethical, practical, and realistic.
Training – Researchers need to be experts in their specific field; they need to be trained in the methods used within their field. Evaluator training is broader. The evaluator (or the evaluation team) needs to work collaboratively with clients (i.e., develop soft skills); they must facilitate and manage evaluation projects efficiently and competently. They must be familiar with a variety of data collection and analysis methods. Evaluators may be experts in the field, but more importantly, they must develop evaluative thinking skills and effectively (persuasively) present information in various ways.
Overlap between research and evaluation is common. Evaluators will use research findings to inform their evaluation efforts, and evaluation research is conducted to provide generalizable information and recommendations. The main difference between research findings and evaluation findings is the value-based judgments made by evaluators. Researchers attempt to be objective and present factual information, whereas evaluators need to provide an opinion (i.e., make a judgment) about the factual information they obtain.
For example, a research study may determine that the average achievement of sixth-grade students at one school differed to a statistically significant degree from the average performance of similar students elsewhere. The researchers might also calculate the effect size (i.e., practical significance). In research, obtaining a statistically significant result means the observed difference was unlikely to be due to chance alone. In contrast, the practical significance of an observed difference estimates the magnitude of that difference. While these findings represent factual information, they are not evaluations. These results would be categorized as descriptive. No judgment is made, and no opinion is given, about the acceptability of the students' performance. An evaluation would require that a judgment be made about the results based on some criteria. Did the students do admirably or abysmally? Based on this information, what recommendations are appropriate? Sometimes research provides these kinds of evaluative opinions, but usually not. Evaluation will always provide a value-based judgment of some kind.
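The distinction in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the test scores are invented, the proficiency cutoff of 70 and the required share of 80% are hypothetical criteria chosen for the example, and the function names are not drawn from any of the sources cited here. The first two functions produce descriptive results (a test statistic and an effect size); only the last one applies an explicit criterion and returns a judgment.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def cohens_d(a, b):
    """Effect size: mean difference divided by the pooled sample standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def judge(scores, proficiency_cutoff, required_share):
    """Evaluative step: compare results against an explicitly stated criterion."""
    share = sum(score >= proficiency_cutoff for score in scores) / len(scores)
    if share >= required_share:
        return "meets the standard"
    return "does not meet the standard"


if __name__ == "__main__":
    # Invented sixth-grade test scores for two schools (illustration only).
    school_a = [78, 82, 85, 88, 90, 76, 84, 81]
    school_b = [70, 75, 72, 68, 74, 77, 71, 73]

    # Descriptive findings: how large is the difference, statistically and practically?
    print("Welch t statistic:", round(welch_t(school_a, school_b), 2))
    print("Cohen's d:", round(cohens_d(school_a, school_b), 2))

    # Evaluative finding: hypothetical criterion that at least 80% of
    # students score 70 or higher.
    print("School A:", judge(school_a, proficiency_cutoff=70, required_share=0.8))
    print("School B:", judge(school_b, proficiency_cutoff=70, required_share=0.8))
```

Changing the cutoff or the required share can change the verdict without changing the data, which is why the criteria need to be stated and defended.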
- Evaluation is the process of determining the merit, worth, and value of things.
- Objects have value because they are useful or desirable to someone for some purpose or reason.
- People differ in the value, merit, and worth they assign to things because their values and the criteria they deem important differ.
- Evaluations are improved when we identify and articulate our personal values as well as the criteria and standards we will use.
- People conduct informal, personal evaluations all the time.
- To maximize their usefulness, formal evaluations should be systematic, comprehensive, accurate, and ethical. They should be based on defensible criteria and data collection methods.
- Explain the benefits of establishing clear defensible criteria to guide an evaluation. What are the likely consequences of not doing so? Provide an example.
- Consider something you value. Articulate the criteria you used to make this determination. What criteria or standard was most significant in your determination (utility, safety, cost, moral, ethical, personal satisfaction or preference, other)? What criteria, if any, did you neglect to consider?
American Evaluation Association. (2018). American Evaluation Association guiding principles for evaluators.
Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2011). Program evaluation: Alternative approaches and practical guidelines. Pearson Education.
Joint Committee on Standards for Educational Evaluation. (1994, 2011). The program evaluation standards: A guide for evaluators and evaluation users (3rd ed.). Thousand Oaks, CA: Sage.
Maslow, A. H. (1970). Motivation and personality. Harper & Row.
Patton, M. Q. (2008). Utilization-focused evaluation (4th ed.). Thousand Oaks, CA: Sage.
Scriven, M. (1991). Evaluation thesaurus (4th ed.). Sage.
Stufflebeam, D. L., & Coryn, C. L. (2014). Evaluation theory, models, and applications (Vol. 50). John Wiley & Sons.