Common Assessment Terms
Assessment for Accountability
The assessment of some unit, such as a department, program or entire institution, which is used to satisfy some group of external stakeholders. Stakeholders might include accreditation agencies, state government, or trustees. Results are often compared across similar units, such as other similar programs, and are always summative. An example of assessment for accountability would be ABET accreditation in engineering schools, whereby ABET creates a set of standards that must be met in order for an engineering school to receive ABET accreditation status.
Assessment for Improvement
Assessment activities that are designed to feed the results directly, and ideally immediately, back into revising the course, program or institution with the goal of improving student learning. Both formative and summative assessment data can be used to guide improvements.
Concept Maps
Concept maps are graphical representations that can be used to reveal how students organize their knowledge about a concept or process. They include concepts, usually represented in enclosed circles or boxes, and relationships between concepts, indicated by a line connecting two concepts.
Direct Assessment of Learning
Direct assessment is when measures of learning are based on student performance or demonstrate the learning itself. Scoring performance on tests, term papers, or the execution of lab skills would all be examples of direct assessment of learning. Direct assessment of learning can occur within a course (e.g., performance on a series of tests) or could occur across courses or years (comparing writing scores from sophomore to senior year).
Embedded Assessment
A means of gathering information about student learning that is integrated into the teaching-learning process. Results can be used to assess individual student performance or they can be aggregated to provide information about the course or program. Can be formative or summative, quantitative or qualitative. Example: as part of a course, expecting each senior to complete a research paper that is graded for content and style, but is also assessed for advanced ability to locate and evaluate Web-based information (as part of a college-wide outcome to demonstrate information literacy).
External Assessment
Use of criteria (rubric) or an instrument developed by an individual or organization external to the one being assessed. This kind of assessment is usually summative, quantitative, and often high-stakes, such as the SAT or GRE exams.
Formative Assessment
Formative assessment refers to the gathering of information or data about student learning during a course or program that is used to guide improvements in teaching and learning. Formative assessment activities are usually low-stakes or no-stakes; they do not contribute substantially to the final evaluation or grade of the student or may not even be assessed at the individual student level. For example, posing a question in class and asking for a show of hands in support of different response options would be a formative assessment at the class level. Observing how many students responded incorrectly would be used to guide further teaching.
High-Stakes Assessment
The decision to use the results of assessment to set a hurdle that needs to be cleared for completing a program of study, receiving certification, or moving to the next level. Most often, the assessment so used is externally developed, based on set standards, carried out in a secure testing situation, and administered at a single point in time. Examples: at the secondary school level, statewide exams required for graduation; in postgraduate education, the bar exam.
Indirect Assessment of Learning
Indirect assessments use perceptions, reflections or secondary evidence to make inferences about student learning. For example, surveys of employers, students’ self-assessments, and admissions to graduate schools are all indirect evidence of learning.
Individual Assessment
Uses the individual student, and his/her learning, as the level of analysis. Can be quantitative or qualitative, formative or summative, standards-based or value added, and used for improvement. Most of the student assessment conducted in higher education is focused on the individual. Student test scores, improvement in writing during a course, or a student’s improvement in presentation skills over their undergraduate career are all examples of individual assessment.
Institutional Assessment
Uses the institution as the level of analysis. The assessment can be quantitative or qualitative, formative or summative, standards-based or value added, and used for improvement or for accountability. Ideally, institution-wide goals and objectives would serve as a basis for the assessment. For example, to measure the institutional goal of developing collaboration skills, an instructor and peer assessment tool could be used to measure how well seniors across the institution work in multi-cultural teams.
Local Assessment
Means and methods that are developed by an institution's faculty based on their teaching approaches, students, and learning goals. An example would be an English Department’s construction and use of a writing rubric to assess incoming freshmen’s writing samples, which might then be used to assign students to appropriate writing courses, or might be compared to senior writing samples to get a measure of value-added.
Program Assessment
Uses the department or program as the level of analysis. Can be quantitative or qualitative, formative or summative, standards-based or value added, and used for improvement or for accountability. Ideally, program goals and objectives would serve as a basis for the assessment. Example: How well can senior engineering students apply engineering concepts and skills to solve an engineering problem? This might be assessed through a capstone project, by combining performance data from multiple senior-level courses, collecting ratings from internship employers, etc. If a goal is to assess value added, some comparison of the performance to newly declared majors would be included.
Qualitative Assessment
Collects data that does not lend itself to quantitative methods but rather to interpretive criteria (see the first example under "standards").
Quantitative Assessment
Collects data that can be analyzed using quantitative methods (see "assessment for accountability" for an example).
Rubrics
A rubric is a scoring tool that explicitly represents the performance expectations for an assignment or piece of work. A rubric divides the assigned work into component parts and provides clear descriptions of the characteristics of the work associated with each component, at varying levels of mastery. Rubrics can be used for a wide array of assignments: papers, projects, oral presentations, artistic performances, group projects, etc. Rubrics can be used as scoring or grading guides, to provide formative feedback to support and guide ongoing learning efforts, or both.
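The structure described above (component parts, each with descriptions at several levels of mastery) can be sketched as a small data structure. This is only an illustration: the component names, the three-level scale, the descriptions, and the score_paper helper are all hypothetical and not taken from any particular rubric.

```python
# Hypothetical rubric for a term paper. Each component of the work has a
# description at three mastery levels, from 1 (lowest) to 3 (highest).
RUBRIC = {
    "thesis": {
        1: "Thesis is missing or unclear",
        2: "Thesis is stated but only partly developed",
        3: "Thesis is clear and developed throughout",
    },
    "evidence": {
        1: "Claims are mostly unsupported",
        2: "Some claims are supported by sources",
        3: "All major claims are supported by sources",
    },
    "style": {
        1: "Frequent errors obscure meaning",
        2: "Occasional errors, meaning is clear",
        3: "Polished prose with few or no errors",
    },
}


def score_paper(ratings):
    """Return (total score, per-component feedback) for one paper.

    `ratings` maps each rubric component to the mastery level assigned
    by the grader. Every component must be rated exactly once.
    """
    if set(ratings) != set(RUBRIC):
        raise ValueError("ratings must cover every rubric component")
    total = sum(ratings.values())
    feedback = {part: RUBRIC[part][level] for part, level in ratings.items()}
    return total, feedback


total, feedback = score_paper({"thesis": 3, "evidence": 2, "style": 2})
```

The total can serve as a grading guide, while the per-component descriptions can be returned to the student as formative feedback.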
Standards
Standards refer to an established level of accomplishment that all students are expected to meet or exceed. Standards do not imply standardization of a program or of testing. Performance or learning standards may be met through multiple pathways and demonstrated in various ways. For example, instruction designed to meet a standard for verbal foreign language competency may include classroom conversations, one-on-one interactions with a TA, or the use of computer software. Assessing competence may be done by carrying on a conversation about daily activities or a common scenario, such as eating in a restaurant, or by using a standardized test, with a rubric or grading key used to score correct grammar and comprehensible pronunciation.
Summative Assessment
The gathering of information at the conclusion of a course, program, or undergraduate career to improve learning or to meet accountability demands. When used for improvement, impacts the next cohort of students taking the course or program. Examples: examining student final exams in a course to see if certain specific areas of the curriculum were understood less well than others; analyzing senior projects for the ability to integrate across disciplines.
Value Added
The increase in learning that occurs during a course, program, or undergraduate education. Can either focus on the individual student (how much better a student can write, for example, at the end than at the beginning) or on a cohort of students (whether senior papers demonstrate more sophisticated writing skills, in the aggregate, than freshmen papers). To measure value added, a baseline measurement is needed for comparison. The baseline measure can be from the same sample of students (longitudinal design) or from a different sample (cross-sectional).
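The two baseline designs can be illustrated with a little arithmetic. This is a minimal sketch: the function names and the writing scores are made up for illustration, and a simple difference of means is only one possible way to summarize value added.

```python
from statistics import mean


def longitudinal_gain(baseline_scores, final_scores):
    """Mean per-student gain when the SAME students are measured twice.

    The two lists must be in the same student order, so that position i
    in each list refers to the same student.
    """
    if len(baseline_scores) != len(final_scores):
        raise ValueError("each student needs a baseline and a final score")
    gains = [final - base for base, final in zip(baseline_scores, final_scores)]
    return mean(gains)


def cross_sectional_gain(freshman_scores, senior_scores):
    """Difference in group means when DIFFERENT students form the baseline."""
    return mean(senior_scores) - mean(freshman_scores)


# Hypothetical writing scores on a 1-5 scale.
same_students_start = [2, 3, 2, 4]
same_students_end = [4, 4, 3, 5]
individual_value_added = longitudinal_gain(same_students_start, same_students_end)

freshman_papers = [2, 3, 2, 3]
senior_papers = [4, 3, 4, 3]
cohort_value_added = cross_sectional_gain(freshman_papers, senior_papers)
```

The longitudinal version pairs each student with their own baseline, while the cross-sectional version compares two different groups and therefore also reflects any differences between those groups.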
Adapted from Assessment Glossary compiled by American Public University System, 2005