EVALUATION OF STUDENT ACHIEVEMENT: METHODS APPROPRIATE TO THE INCORPORATION OF THE "LEARNING PARADIGM" IN HIGHER EDUCATION
by Randall Burks © 1999
ABSTRACT
This paper focuses on how the individual professor would assess student learning in his or her own classroom while attempting to operate under the Learning Paradigm, although Barr and Tagg actually see the Learning Paradigm as a framework that must be school-wide, not just found in an individual classroom. Assuming, though, that most higher education institutions will be reluctant to shift wholesale to such a paradigm, we can envision how an instructor should operate under a new form of evaluation that must necessarily build upon traditional measurement and evaluation techniques originally hammered out under the traditional Instruction Paradigm. But "[f]rom the point of view of the Learning Paradigm, these Instruction Paradigm teaching and learning structures present immense barriers to improving student learning and success. They provide no space and support for redesigned learning environments or for experimenting with alternative learning technologies. They don't provide for, warrant, or reward assessing whether student learning has occurred or is improving." So wrote Barr and Tagg in their article, "From Teaching to Learning." Most works on measurement, evaluation, and grading have been produced as a corollary to the traditional Instruction Paradigm. If higher education institutions are to adopt the Learning Paradigm, they will also have to adopt more suitable methods of evaluating the learning that is taking place. These methods should emphasize Authentic Assessment and should exhibit an appreciation of multiple intelligences. However, since some instructors have little or no formal training in traditional measurement and evaluation theory, it is also useful to review the best practices of traditional measurement and evaluation, indicating which techniques are paradigm-specific and which ones are universal principles.
In general, it is evident that most traditional techniques should be retained and adapted, while providing a conceptual framework upon which the new paradigm can be built. Certain factors such as validity and reliability are universally necessary, regardless of the specific form of evaluation carried out under various paradigms. Suggestions are made for further research on developing a new paradigm of learning evaluation to accompany the new paradigm of learning.
INVESTIGATION AND FINDINGS
The confusing nature of measuring learning is poignantly reflected in a 1972 "Peanuts" cartoon. Lucy complains,
In the Learning Paradigm, . . . a college's purpose is not to transfer knowledge but to create environments and experiences that bring students to discover and construct knowledge for themselves, to make students members of communities of learners that make discoveries and solve problems. The school aims, in fact, to create a series of ever more powerful learning environments.
Under this paradigm, a school does not focus so much on instruction as on the real goal of teaching: learning. The best methods of demonstrating that such learning has taken place must be an integral part of the process of learning. Barr and Tagg further explain:
The Learning Paradigm prescribes no one "answer" to the question of how to organize learning environments and experiences. It supports any learning method and structure that works, where "works" is defined in terms of learning outcomes, not as the degree of conformity to an ideal classroom archetype. In fact, the Learning Paradigm requires a constant search for new structures and methods that work better for student learning and success, and expects even these to be redesigned continually and to evolve over time.
As such, assessment/evaluation/measurement should actually assist in the learning experience, and the learning experience should assist in the evaluation of its success. For purposes of this paper, we will focus on the issues surrounding the assignment of grades representing the mastery of an academic course. Although their article did not focus on the aspect of evaluation very extensively, Barr and Tagg did include a table which is helpful in illustrating the suggested differences and giving some direction for future development:
| THE INSTRUCTION PARADIGM | THE LEARNING PARADIGM |
| Covering material | Specified learning results |
| End-of-course assessment | Pre/during/post assessments |
| Grading within classes by instructors | External evaluations of learning |
| Private assessment | Public assessment |
| Degree equals accumulated credit hours | Degree equals demonstrated knowledge and skills |
Barr and Tagg explain that
The effectiveness of the assessment system for developing alternative learning environments depends in part upon its being external to learning programs and structures. While in the Instruction Paradigm students are assessed and graded within a class by the same instructor responsible for teaching them, in the Learning Paradigm, much of the assessment would be independent of the learning experience and its designer, somewhat as football games are independent measures of what is learned in football practice. Course grades alone fail to tell us what students know and can do; average grades assigned by instructors are not reliable measures of whether the instruction is improving learning.
Thus, Barr and Tagg envision that assessment would mainly take place at the end of the overall learning experience: at graduation. And this assessment process would eventually result in transforming the school into offering better learning environments, such that the desired outcomes would appear. However, this paper focuses more on what an individual professor can do to provide accurate assessment within the classroom while attempting to adopt some of the Learning Paradigm techniques and philosophies, since most postsecondary institutions apparently are not going to shift to the Learning Paradigm anytime soon.
We begin with a review of the issues underlying how learners should be evaluated in general. The answer depends partly on how the learning was intended to take place. Even within the traditional Instruction Paradigm, there are so many conflicting opinions and approaches currently practiced that instructors are often confused about how to measure and evaluate the learning process. The issues are muddled further when we approach the post-secondary arena, with its wide variety of applications--college, continuing education, remedial classes and ESOL, vocational, professional, community and recreational classes, etc.--and each of these may actually need its own approach. Thus, as Caffarella mentions in Ethical Issues in Adult Education, "a number of ethical issues and problems can arise in the implementation of this process" of designing and conducting appropriate evaluation and assessment of learning (Brockett, 1988, p. 111).
One of the deficiencies of grading practices in general is "the lack of sufficient, relevant, and objective evidence to use as a basis for assigning grades" (Ebel & Frisbie, 1986, p. 247). However, that statement can be misinterpreted and misapplied, leading instructors to focus too much on frequent standardized testing. We shall see that a greater focus on practical application would be a better approach. We will examine general guidelines for measurement and evaluation offered by the experts, and how the goals of the Learning Paradigm and Authentic Assessment fit in with these traditional guidelines.
First of all, the outcomes reflected in the grade should be relevant to the curricular goals of the particular school in question (Mehrens & Lehmann, 1991, pp. 483, 492; Tollman, 1975, p. 168). Under the Learning Paradigm, the goals as stated by Barr and Tagg are to ensure that the students learn, not just that they have heard lectures and are able to temporarily memorize some material for a test. Thus, the Authentic Assessment methods, including the use of task-accomplishment checklists, performance rating charts, actual production of projects, performances, and portfolios, would comprise a useful method of showing that the learning has taken place.
Second, the mark should reflect some verifiable level of achievement of observable (behavioral) objectives, for the simple reason that these are the only kind of objectives that can really be directly measured. As Guerin and Maier (1983) point out, "Comments about the cognitive, affective, and motivational elements of performance tend to be based on data obtained less directly, and are subject to errors in interpretation, conjecture, and speculation," but behavioral objectives "address the action of the student and provide observable indices of the condition(s) under consideration," and "once the problem has been clearly stated in behavioral [observable] terms," then objective evaluation can take place (pp. 134-135). There is absolutely no conflict here between the traditional advice and the Learning Paradigm.
Note that, under the traditional Instruction Paradigm, subjective comments can be useful for guidance and recommending remedial work but not for grading: "Instead, letter grades should be supplemented by . . . . [a separate report of] such factors as effort, attitude, work habits, and personal-social characteristics [which] can be reported on separately" (Gronlund & Linn, 1990, p. 429), and these can offer direction for special assistance, intervention, or remedial work. But "only by making the letter grade as pure a measure of achievement as possible and reporting on these other aspects separately can we hope to improve our descriptions of pupil learning and development" (Gronlund & Linn, 1990, p. 437). However, it seems reasonable that in the Learning Paradigm, subjective comments would be useful for some types of grading, insofar as the learning outcomes would have a subjective component, such as drama performance.
Third, the grade should arise from an evaluation/reporting system that has a desirable effect on students' learning. For example, a learning experience using a sports/game model with a score based on points earned in the game would likely increase motivation to learn, partly in order to gain those points and attempt to achieve a winning status. But, as a nonexample, the pass/fail (advance/review) system has been shown to decrease motivation to learn (Mehrens & Lehmann, 1991, pp. 487, 494).
Fourth, traditional experts discourage using extra credit because it causes the instructor to end up evaluating students on an unequal basis since not all of them do the same work (Thorndike, Hagen, Thorndike, and Cunningham, 1991, p. 177). This concern may be partially irrelevant within the Learning Paradigm, because the goals of the paradigm tend to be met more by completion of projects, portfolios, games, simulations, cooperative learning, teamwork, and so on rather than the mere accumulation of points on written quizzes, tests, and homework. Under the Learning Paradigm, instructors act as facilitators of learning and use "specified learning results," "pre/during/post assessments," "external evaluations of learning," and "public assessment," such that learning and evaluation go hand-in-hand throughout the process; thus the built-in evaluation opportunities have a formative value rather than just a summative one. Notice that the next suggestion does not in any way diminish the need to keep learning and evaluation married; the "end product" refers to the planning that the instructor invests in the process; it does not imply that only summative evaluation should be used for rating scales. In fact, Barr and Tagg report that, in the Learning Paradigm, "we would assess student learning routinely and constantly."
Fifth, any components based on rating scales of attainment should
be the end product of testing and controlled observation, rather than depend on snap judgments based on hazy recollections of incidental happenings. . . . The items included in the final report form should be those on which teachers can obtain reasonably reliable and valid information. (Gronlund & Linn, 1990, p. 435)
Mehrens and Lehmann (1991) recommend reporting detailed components of progress (toward the various course objectives) in the form of rating scales, reported in addition to the letter grade (p. 492). Since Authentic Assessment concepts focus on the practical demonstration of knowledge applied to realistic, authentic situations, rating scales or rubrics for these "performances" are a must. Even in the Learning Paradigm, the students' performance should be reported accurately, not via "hazy recollections of incidental happenings." The difference is that the Learning Paradigm includes rating from outside observers, not only the instructor, and the assessment may be publicly-based rather than classroom-bound.
Sixth, grades should be honest, accurate, candid reports even in cases of poor progress:
Avoiding the anguish of assigning grades by only giving high grades is a dereliction of duty. Nothing is more damaging to [constituent-school] rapport than to have [the parties] erroneously believe that [the student] has no academic problems. The school does no one a favor by shielding a [student] in this manner. (Thorndike, et al., 1991, p. 176)
Even in the Learning Paradigm, where the school is conscientiously focused on the goal of learning and is willing to try almost any reasonable means to ensure that learning occurs, it is possible that some intervening circumstances could prevent the desired learning from occurring. If this happens, the traditional wisdom expressed by Thorndike et al. would probably hold true: The student should not be led to believe that the required learning has taken place if it has not. On the other hand, in the Learning Paradigm, the instructor is constantly trying to determine how to create a better learning environment, one where all students can succeed. Barr and Tagg refer to a group of instructors who learned new techniques in a classroom assessment seminar: "Given information that their students were not learning, it was obvious to these teachers that something had to be done about the methods they had been using."
Seventh, the grade should be based on a combination of data about attainment of the various course objectives, weighted according to the estimated relative importance of each objective. Thus, a grade in Spanish might reflect both tests of knowledge (vocabulary, grammar) and ratings on performance skills (listening comprehension, pronunciation), with each factor weighted according to its importance to the overall instructional goals (Gronlund & Linn, 1990, p. 437). In the Learning Paradigm, the instructor definitely has certain outcomes in mind, and it is still appropriate to weight them according to importance, especially if the school requires a letter grade to be reported.
Once the teacher has tested certain components, these will have to be combined into one letter grade. Each component should be weighted according to its importance. Unfortunately, most teachers, including many even at the university level, have been using a technique which is absolutely backwards from the scientifically accurate procedure for weighting grading components. This is an extremely important procedure that must be done correctly, or students' grades will be mangled! And many grades have been mangled by teachers following a procedure that seems right but is definitely wrong.
For example, if the teacher wants to weight the final exam twice as much as the midterm, then he will probably just multiply the raw points (and points possible) by 2, and then add the midterm and doubled final scores together.
However, the correct procedure involves looking at the range of scores obtained, because a component's actual influence on the combined total depends on how widely its scores are spread. If the students' scores on the midterm ranged from 10 to 50, and the final-exam scores ranged from 80 to 100, then the range of midterm scores is 40 (50-10) and the range of final-exam scores is 20 (100-80). In this case, doubling the final-exam scores would only equate the weights (since 20 x 2 = 40). In this particular case, then, the final-exam scores would have to be multiplied by four, not two, in order to make the final weight double that of the midterm (the range of the final would then be 80, which is twice the midterm's range of 40).
Notice that if the midterm range had been 10 originally and the final range had been 40 originally, then the MIDTERM scores would all have to be multiplied by 2, to make the range 20, which is half of the final-exam range of 40, and thus the final exam would have twice the weight of the midterm. This correct procedure is the opposite of many teachers' procedure of automatically doubling the final-exam scores! (Dembo, 1991, pp. 564-565; Gronlund & Linn, 1990, pp. 438-439). The important principle is to equate the ranges before weighting. Only when the teacher has equated the ranges will it be clear which assignments and tests should be multiplied by what factors to achieve the desired weights.
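The range-equating procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from the cited texts: the function names (score_range, weighting_multipliers, composite_score) and the individual score lists are assumptions made up for the example, and the range is used as the measure of spread simply because that is the measure the preceding paragraphs describe.

```python
from fractions import Fraction


def score_range(scores):
    """Spread of one graded component: highest score minus lowest score."""
    return max(scores) - min(scores)


def weighting_multipliers(component_scores, desired_weights):
    """Return the factor by which each component's raw scores should be multiplied.

    component_scores: dict mapping a component name to the list of raw scores earned
    desired_weights:  dict mapping the same names to the intended relative weights
    (Both structures are hypothetical, chosen only for this illustration.)
    """
    raw = {}
    for name, scores in component_scores.items():
        spread = score_range(scores)
        if spread == 0:
            raise ValueError(f"{name}: all scores are equal, so it cannot carry weight")
        # Dividing by the range equates the spreads; multiplying by the
        # desired weight then sets the intended relative influence.
        raw[name] = Fraction(desired_weights[name]) / Fraction(spread)
    smallest = min(raw.values())
    # Rescale so the smallest multiplier is 1, which keeps the numbers readable.
    return {name: factor / smallest for name, factor in raw.items()}


def composite_score(student_scores, multipliers):
    """Weighted total for one student, given that student's raw score per component."""
    return sum(multipliers[name] * score for name, score in student_scores.items())


# First case from the text: midterm scores span 10-50 (range 40), final-exam
# scores span 80-100 (range 20), and the final should count twice as much.
first = weighting_multipliers(
    {"midterm": [10, 25, 40, 50], "final": [80, 85, 95, 100]},
    {"midterm": 1, "final": 2},
)
print(first["midterm"], first["final"])    # prints: 1 4

# Second case from the text: midterm range 10, final range 40.
second = weighting_multipliers(
    {"midterm": [40, 45, 50], "final": [60, 80, 100]},
    {"midterm": 1, "final": 2},
)
print(second["midterm"], second["final"])  # prints: 2 1
```

Both results agree with the worked figures in the text: in the first case the final-exam scores are multiplied by four, and in the second case it is the midterm scores that are doubled. Once the multipliers are known, composite_score can be applied to each student's raw scores before letter grades are assigned.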
Assuming that an individual instructor is operating within a classroom that is still steeped in the traditional Instruction Paradigm and requires traditional letter grades for report cards, the instructor will be obliged to use valid statistical techniques as explained in the previous paragraphs, even though he or she is attempting to operate under the Learning Paradigm in terms of creating learning environments.
Eighth, whatever system is used, it must be clearly understood by both the professor and the student. "The students must clearly understand what is expected of them in terms of performance--no guessing should be involved (this may mean re-writing the objectives [in simpler language] for the students)" (Tillman, 1975, p. 169).
Barr and Tagg make reference to specific learning outcomes in the Learning Paradigm. They explain that "[l]earning outcomes and standards thus would be identified and held to for all students—or raised as learning environments became more powerful—while the time students took to achieve those standards would vary." The issue of clarifying those learning outcomes remains salient under this paradigm.
Ninth, the grade should represent achievement at a certain level, period, rather than representing "effort" as supposedly determined by a comparison of the student's aptitude with his output. Some teachers want to consider effort because they think it allows for individual differences. They might say, for example, "Johnny tried hard and achieved 70% whereas Tom didn't try very hard and still achieved 70%. So I think Johnny should get a higher grade because he tried harder." However, solid experts in the field of educational measurement and evaluation state that this would only result in confusion and unreliability, because (1) effort can easily be faked, (2) both aptitude and achievement are very fallible measures to start with, and a comparison of them is generally unreliable, (3) "students must learn to realize that real life achievement is generally more important than effort or interest," and (4) "giving the same mark for different levels of performance obscures rather than accounts for differences" (Mehrens & Lehmann, 1991, pp. 484, 486 [emphasis added]).
This traditional wisdom remains true under the Learning Paradigm, as well. Barr and Tagg state:
Learning outcomes and standards . . . would be identified and held to for all students—or raised as learning environments became more powerful—while the time students took to achieve those standards would vary. This would reward skilled and advanced students with speedy progress while enabling less prepared students the time they needed to actually master the material. By "testing out," students could also avoid wasting their time being "taught" what they already know. Students would be given "credit" for degree-relevant knowledge and skills regardless of how or where or when they learned them.
In the Learning Paradigm, then, a degree would represent not time spent and credit hours dutifully accumulated, but would certify that the student had demonstrably attained specified knowledge and skills. Learning Paradigm institutions would develop and publish explicit exit standards for graduates and grant degrees and certificates only to students who met them.
Tenth: in general, the grade should reflect present status of achievement rather than growth (improvement). Thorndike, et al. (1991), agree with this but also point out that grading is especially difficult in classes in which final level of performance is inherently dependent upon the incoming ability status. They admit that this situation
is particularly noticeable in courses like art, music, and foreign languages. . . . On the first day of class, there will be some students who can perform at a level that far exceeds that which other students with less ability may ever achieve. With minimal effort, they can remain ahead of other students who work much harder. There is an understandable reluctance to ignore the degree of improvement or effort that students with less ability may exhibit over the year. The student gifted in art who does nothing for a semester may not deserve an A, despite the fact that at the end of the course he or she can produce the best project. Conversely, it seems only fair to give credit to the student who starts with little ability but who, by expending great effort, shows considerable improvement. (p. 179)
It is unclear if they are recommending this system or if they are simply describing certain teachers' opinions. Either way, improvement "is hard to measure. Change scores are notably unreliable. Regression effects result in an advantage to students with initially low scores. Students soon learn to fake ignorance [or incompetence] on pretests to present an illusion of great growth" (Mehrens & Lehmann, 1991, p. 486). Furthermore, it is likely that "growth is defined in terms of the teacher's perception of changes in the performance of students. This subjective approach" has the undesirable feature that "teacher prejudice and bias regarding specific students may enter into the evaluation process" (Terwilliger, 1971, p. 39).
Terwilliger also points out that "the students who come into the classroom with the most knowledge or skill (probably those who are the most able and are highly motivated) will be penalized if they are judged on the basis of demonstrated growth." Likewise, "a student may have shown little growth but still be outstanding. Conversely, a student may have made great progress and still be only average." So he concludes, "despite its intuitive appeal and apparent advantages, growth [improvement] is a logically and technically unsound basis for assigning grades to students" (Terwilliger, 1971, p. 51). Besides, grades are popularly viewed by parents, corporate employers, and college/professional-school admissions committees as certifications of levels of mastery. A high grade for a much-improved but still-unskillful student would be of doubtful value to prospective employers or colleges (Thorndike, et al., 1991, p. 176).
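The regression effect that Mehrens and Lehmann mention can be illustrated with a small simulation. The sketch below rests on assumptions chosen only for illustration and drawn from none of the cited sources: each simulated student has a stable true level of achievement that does not change at all between pretest and posttest, and each test adds independent random measurement error. The function names, the class size, and the score distributions are all hypothetical.

```python
import random
import statistics


def simulate_gain_scores(n_students=2000, noise_sd=10.0, seed=0):
    """Simulate pretest scores and gain scores when no real learning occurs.

    Each student has a fixed true level; pretest and posttest differ from it
    only by independent measurement error (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pretests, gains = [], []
    for _ in range(n_students):
        true_level = rng.gauss(70, 10)
        pre = true_level + rng.gauss(0, noise_sd)
        post = true_level + rng.gauss(0, noise_sd)
        pretests.append(pre)
        gains.append(post - pre)
    return pretests, gains


def mean_gain_by_half(pretests, gains):
    """Average gain for students below, and at or above, the median pretest score."""
    cutoff = statistics.median(pretests)
    low = [g for p, g in zip(pretests, gains) if p < cutoff]
    high = [g for p, g in zip(pretests, gains) if p >= cutoff]
    return statistics.mean(low), statistics.mean(high)


pretests, gains = simulate_gain_scores()
low_gain, high_gain = mean_gain_by_half(pretests, gains)
print(f"mean gain, lower pretest half:  {low_gain:+.1f}")
print(f"mean gain, higher pretest half: {high_gain:+.1f}")
```

Even though no simulated student actually improves, the half of the class with lower pretest scores shows a positive average "gain" and the half with higher pretest scores shows a negative one. Under these assumptions, grading on growth would reward and penalize students for measurement error alone, which is consistent with the warning that change scores are unreliable.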
It is understandable that some educators express concern that achievement status evaluation (as opposed to growth evaluation) "seems to condemn some students to low grades in most subjects, semester after semester," acknowledge Ebel and Frisbie (1986). However, they contend that "the remedy probably is not to try to persuade them that their rate of growth . . . is more important than status achieved, for that is a transparent falsehood" (p. 251). What is the remedy, then?
The remedy is probably to provide varied opportunities to excel in several kinds of worthwhile activities. The planning and implementation of such efforts certainly would require an alert, versatile, and dedicated teacher. When it is accomplished, though, grading on the basis of status achieved will no longer mean that some students must always win while others must always lose. Instead some students will be able to enjoy some of the rewards of excellence in their own specialties. (pp. 251-252)
The higher-ed or adult-education instructor should provide enough variety of program components in which achievement can be measured so that every normal student will have a chance to succeed in some area. It is crucial, though, that these components all be truly, logically representative of the course's instructional objectives, and not just tacked on as an afterthought (Ebel & Frisbie, 1986, p. 255).
The instructor must make sure that "the objectives are represented by assessment tasks in proportion to their importance rather than in proportion to the ease with which they can be translated into behavior" (Lehman, 1992, p. 59).
There is also a philosophical continuum between product and process. It appears more effective to emphasize process, so that students develop many basic skills of the discipline that are transferable and applicable to any situation they might encounter in that discipline. Thus the student becomes a well-rounded, informed learner instead of a shallow, limited dabbler. This distinction is comparable to the difference between an individual who can pronounce words of a foreign language and the native speaker who really understands the words and their underlying significance.
If written tests or any other tests are to be given, the factors measured must be taught first. Likewise, they must be tested for each individual and records must be kept if accountability is to be a reality. There is a hilarious joke with a sad message about a biology professor who tested his students on bird identification by showing them only the feet of the birds. Unfortunately, he was not measuring the knowledge that he had spent time imparting in class!
Adult and collegiate educators could take some cues from K-12 music instructors, who likewise have a difficult area to evaluate, one which has many parallels with adult-education programs and their de-emphasis on grades. After reviewing various literature, including a complicated PSI (Personalized System of Instruction) approach, I feel that Labuta's concept of testing instructed skills seems the most logical solution, because its inclusion will definitely and automatically enhance the ensemble's performances as well as provide the variety of tasks needed to give a chance for success to students of varying abilities, as mentioned above in Ebel and Frisbie (1986, p. 255).
Labuta (1974) explains, "Components of accountable instruction include: a systems approach, behavioral objectives, assessment, criterion-referenced measurement, instructional media, teaching/learning strategies, systematic instruction, and entry and exit tests" (p. 7). This volume is filled with eye-opening charts, graphs and examples, test items, objectives, and blank forms. There are even several examples of how professional teachers have successfully put the systems approach into action in various schools. Labuta (1974) conceptualizes the accountability model as comprising four phases: design, implementation, assessment, and recycling. He describes and shows how to use eight steps in this approach: (1) Specify behavioral objectives, (2) Develop the criterion test and entry test, (3) Select or develop procedures and media, (4) Pre-assess student competencies, (5) Try procedures and media, (6) Administer the posttest, (7) Analyze and interpret test data, and (8) Modify the instructional system on the basis of analysis (pp. 144-153).
W. Clyde Duvall (1960, pp. 54-77) presents a point system for grading which will enable the instructor to give inquirers "a clear, concise answer--not a nebulous statement that will further confuse them. A point system, if accurately administered, can furnish him with the information needed to give these clear answers" (p. 55).
As quoted above from Barr and Tagg, the Learning Paradigm would ensure that specific levels of achievement are attained. Thus, the basic sense of the traditional wisdom here in point 10 remains relevant. To repeat,
In the Learning Paradigm, then, a degree would represent not time spent and credit hours dutifully accumulated, but would certify that the student had demonstrably attained specified knowledge and skills. Learning Paradigm institutions would develop and publish explicit exit standards for graduates and grant degrees and certificates only to students who met them.
The best, most relevant approach to evaluation is probably found in the concept of Authentic Assessment. One of the clearest descriptions of authentic assessment is found in Thomas Armstrong’s Multiple Intelligences in the Classroom. Although a presentation of the practices of authentic assessment would be beyond the scope of this paper, it should be noted that the basic concept is that grading should be based on the demonstration of knowledge and skills in a realistic or practical production of works involving such knowledge or skills. Application and synthesis are valued above mere recall of hastily-memorized (and probably soon-forgotten) data on objective tests.
Whether the traditional techniques or the newer authentic assessment methods are employed, accountability remains indispensable. The goal of accountability is to be sure that valid objectives are actually being taught and learned in classes. The 10 aspects of valid grading presented in this paper should help any instructor to formulate a program that entails the learning of important objectives that are useful for developing well-rounded learners and are logical and defensible to the students and administration, as well as leading to more objective, valid, and reliable evaluation procedures.
IMPACT & SIGNIFICANCE OF THIS ANALYSIS
Many professors would benefit from greater knowledge and practical skills in valid practices of evaluation, whether based on the currently-prevalent Instruction Paradigm, or based on the newly-emerging Learning Paradigm. Perhaps at least a few professors have little knowledge or skill related to developing valid, reliable, and authentic assessment of learning. Even if an individual instructor attempts to incorporate or shift to the Learning Paradigm, it will generally be within the context of a school that continues to operate under the traditional Instruction Paradigm, and thus letter grades will be required for courses as usual. Therefore, a hybrid form of evaluation may be necessary.
I have suggested that the new paradigm for learning needs to be accompanied by a well-reasoned system of measurement and evaluation, based on the ten principles established by research and scholarship as outlined above, which will do justice to the benefits of the Learning Paradigm. The Learning Paradigm seeks to inspire authentic learning. The Authentic Assessment paradigm seeks to assess that learning by encouraging demonstration of the learning that has taken place. Studying and implementing these principles can at least be helpful in avoiding the problem cynically highlighted by P. L. Dressel (quoted in Mehrens & Lehmann, 1991, p. 484), who described a grade as an "inadequate report of an inaccurate judgment by a biased and variable judge of the extent to which a student has attained an undefined level of mastery of an unknown proportion of an indefinite material."
REFERENCES
Armstrong, T. (1996). Multiple intelligences in the classroom. Alexandria, VA: Association for Supervision and Curriculum Development.
Barr, R., & Tagg, J. (1995). From teaching to learning: A new paradigm for undergraduate education. Change, Nov./Dec. 1995.
Brockett, R. (Ed.) (1988). Ethical issues in adult education. New York: Teachers College Press.
Dembo, M. H. (1991). Applying educational psychology in the classroom. (4th ed.). New York: Longman.
Duvall, W. C. (1960). The high school band director's handbook. Englewood Cliffs, NJ: Prentice-Hall.
Ebel, R. L., & Frisbie, D. A. (1986). Essentials of educational measurement. (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Goldenberg, R. E. (1996). A model for assessing student outcomes in adult education [WWW document]. URL http://www2.nu.edu/nuri/llconf/confl1996/agolden.html
Gronlund, N. E., & Linn, R. L. (1990). Measurement and evaluation in teaching. (6th ed.). New York: Macmillan.
Guerin, G. R., & Maier, A. S. (1983). Informal assessment in education. Palo Alto, CA: Mayfield.
Labuta, J. A. (1972). Teaching musicianship in the high school band. West Nyack, NY: Parker.
Labuta, J. A. (1974). Guide to accountability in music instruction. West Nyack, NY: Parker.
Madsen, C. K., & Yarbrough, C. (1980). Competency-based music education. Englewood Cliffs, NJ: Prentice-Hall.
Mehrens, W. A., & Lehmann, I. J. (1991). Measurement and evaluation in education and psychology. (4th ed.). Fort Worth: Holt.
Meisels, S. J. (1996). Using work sampling in authentic assessments [WWW document] URL http://www.ascd.org/pubs/el/dec96/meisels.html
Terwilliger, J. S. (1971). Assigning grades to students. Glenview, IL: Scott, Foresman.
Thorndike, R. L., Hagen, E. P., Thorndike, R. M., & Cunningham, G. K. (1991). Measurement and evaluation in psychology and education. (5th ed.). New York: Macmillan.