Rich Shavelson
Margaret Jacks Professor of Education, Emeritus
Graduate School of Education
Bio
His current work focuses on the design of assessments and assessment systems that measure college students' learning, both their development of competence/achievement and so-called “soft skills” such as perspective taking. He co-created the Collegiate Learning Assessment with Steve Klein and built statistical models for estimating value added for the CLA and other college-level assessments. This work is summarized in Measuring College Learning Responsibly: Accountability in a New Era (2010, Stanford University Press) and in recent papers on the measurement and statistical modeling of competence.
Academic Appointments
-
Emeritus Faculty, Acad Council, Graduate School of Education
Administrative Appointments
-
I. James Quillen Dean, School of Education (1995 - 2000)
-
Emeritus Professor of Psychology (by courtesy), School of Humanities and Sciences (1995 - Present)
-
Professor Emeritus, Stanford Graduate School of Education (1995 - Present)
-
Emeritus Affiliated Faculty, Stanford Institute for the Environment (2005 - 2007)
Honors & Awards
-
Humboldt Fellowship, Humboldt Foundation, Germany (1994)
-
AERA Review of Research Award, American Educational Research Association (2008)
-
AERA Review of Research Award, American Educational Research Association (1978)
-
E.F. Lindquist Award, American Educational Research Association (2011)
-
R.L. Linn Award, American Educational Research Association (2016)
-
E.L. Thorndike Award, American Psychological Association (2010)
Boards, Advisory Committees, Professional Organizations
-
Assistant Professor of Education, UCLA (1973 - 1975)
-
Associate Professor of Education, UCLA (1975 - 1979)
-
Professor of Education, UCLA (1979 - 1987)
-
Dean, Graduate School of Education, University of California (1987 - 1993)
-
Professor of Education, UCSB (1987 - 1996)
-
Professor of Education, UCSB with an affiliated appointment in Statistics and Applied Probability (1993 - 1996)
-
Vice Chair and Chair, Board on Testing and Assessment, National Academy of Sciences (1993 - 1998)
-
Board Member, Yosemite National Institutes/NatureBridge (1996 - 2008)
-
Board Member, The Spencer Foundation (1997 - 2005)
-
Education Advisory Council (Chair), NatureBridge (2009 - 2012)
-
Board Member, BSCS (formerly Biological Sciences Curriculum Study) (2009 - 2018)
-
Chair, BSCS Board (2016 - 2018)
-
Member and Vice Chair Governance Committee, Stanford Historical Society (2017 - 2019)
Professional Education
-
PhD, Stanford University, Educational Psychology (1971)
-
MA, San Jose State College, Psychology (1967)
-
BA, University of Oregon, Psychology (1964)
Research Interests
-
Assessment, Testing and Measurement
-
Higher Education
-
Psychology
-
Research Methods
Current Research and Scholarly Interests
Assessment of learning in higher education (including the Collegiate Learning Assessment); accountability in higher education; higher education policy.
2023-24 Courses
-
Independent Studies (7)
- Directed Reading
EDUC 480 (Aut, Win, Spr)
- Directed Reading in Education
EDUC 180 (Aut, Win, Spr)
- Directed Research
EDUC 490 (Aut, Win, Spr)
- Directed Research in Education
EDUC 190 (Aut, Win, Spr)
- Master's Thesis
EDUC 185 (Aut, Win, Spr)
- Practicum
EDUC 470 (Aut, Win, Spr)
- Supervised Internship
EDUC 380 (Aut, Win, Spr)
All Publications
-
Rescue an Enterprise from Failure: An Innovative Assessment Tool for Simulated Performance
ASSESSMENT OF LEARNING OUTCOMES IN HIGHER EDUCATION: CROSS-NATIONAL COMPARISONS AND PERSPECTIVES
2018: 123–44
View details for DOI 10.1007/978-3-319-74338-7_7
View details for Web of Science ID 000441050900008
-
Performance indicators of learning in higher education institutions: an overview of the field
RESEARCH HANDBOOK ON QUALITY, PERFORMANCE AND ACCOUNTABILITY IN HIGHER EDUCATION
2018: 249–63
View details for Web of Science ID 000447790100019
-
International Performance Assessment of Learning in Higher Education (iPAL): Research and Development
ASSESSMENT OF LEARNING OUTCOMES IN HIGHER EDUCATION: CROSS-NATIONAL COMPARISONS AND PERSPECTIVES
2018: 193–214
View details for DOI 10.1007/978-3-319-74338-7_10
View details for Web of Science ID 000441050900011
-
Gavriel Salomon: In Memoriam
EDUCATIONAL PSYCHOLOGY REVIEW
2016; 28 (2): 207–13
View details for DOI 10.1007/s10648-016-9367-1
View details for Web of Science ID 000376253000001
-
On the practices and challenges of measuring higher education value added: the case of Colombia
ASSESSMENT & EVALUATION IN HIGHER EDUCATION
2016; 41 (5): 695-720
View details for DOI 10.1080/02602938.2016.1168772
View details for Web of Science ID 000377038800004
-
The international state of research on measurement of competency in higher education
STUDIES IN HIGHER EDUCATION
2015; 40 (3): 393-411
View details for DOI 10.1080/03075079.2015.1004241
View details for Web of Science ID 000350804300002
-
Beyond Dichotomies: Competence Viewed as a Continuum
ZEITSCHRIFT FUR PSYCHOLOGIE-JOURNAL OF PSYCHOLOGY
2015; 223 (1): 3-13
View details for DOI 10.1027/2151-2604/a000194
View details for Web of Science ID 000352073100002
-
On the Factorial Structure of the SAT and Implications for Next-Generation College Readiness Assessments
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT
2014; 74 (5): 859-874
View details for DOI 10.1177/0013164414528332
View details for Web of Science ID 000341865700007
-
Factors Contributing to Problem-Solving Performance in First-Semester Organic Chemistry
JOURNAL OF CHEMICAL EDUCATION
2014; 91 (7): 976-981
View details for DOI 10.1021/ed400696c
View details for Web of Science ID 000339090200007
-
Using Formal Embedded Formative Assessments Aligned with a Short-Term Learning Progression to Promote Conceptual Change and Achievement in Science
INTERNATIONAL JOURNAL OF SCIENCE EDUCATION
2014; 36 (4): 531-552
View details for DOI 10.1080/09500693.2013.787556
View details for Web of Science ID 000336504200001
-
Self-regulated learning study strategies and academic performance in undergraduate organic chemistry: An investigation examining ethnically diverse students
JOURNAL OF RESEARCH IN SCIENCE TEACHING
2013; 50 (6): 660-676
View details for DOI 10.1002/tea.21095
View details for Web of Science ID 000322006100003
-
On an Approach to Testing and Modeling Competence
EDUCATIONAL PSYCHOLOGIST
2013; 48 (2): 73-86
View details for DOI 10.1080/00461520.2013.779483
View details for Web of Science ID 000317898700001
-
Context matters: volunteer bias, small sample size, and the value of comparison groups in the assessment of research-based undergraduate introductory biology lab courses.
Journal of microbiology & biology education : JMBE
2013; 14 (2): 176-182
Abstract
The shift from cookbook to authentic research-based lab courses in undergraduate biology necessitates the need for evaluation and assessment of these novel courses. Although the biology education community has made progress in this area, it is important that we interpret the effectiveness of these courses with caution and remain mindful of inherent limitations to our study designs that may impact internal and external validity. The specific context of a research study can have a dramatic impact on the conclusions. We present a case study of our own three-year investigation of the impact of a research-based introductory lab course, highlighting how volunteer students, a lack of a comparison group, and small sample sizes can be limitations of a study design that can affect the interpretation of the effectiveness of a course.
View details for DOI 10.1128/jmbe.v14i2.609
View details for PubMedID 24358380
-
Understanding Academic Performance in Organic Chemistry
JOURNAL OF CHEMICAL EDUCATION
2011; 88 (9): 1238-1242
View details for DOI 10.1021/ed900067m
View details for Web of Science ID 000293813100011
-
Validating the use of concept-mapping as a diagnostic assessment tool in organic chemistry: implications for teaching
CHEMISTRY EDUCATION RESEARCH AND PRACTICE
2011; 12 (2): 133-141
View details for DOI 10.1039/C1RP90018H
View details for Web of Science ID 000289951500003
- Measuring college learning responsibly: Accountability in a new era Stanford University Press. 2010
-
Supporting Valid Interpretations of Learning Progression Level Diagnoses
JOURNAL OF RESEARCH IN SCIENCE TEACHING
2009; 46 (6): 699-715
View details for DOI 10.1002/tea.20308
View details for Web of Science ID 000268606600007
- Generalizability theory and its contribution to the discussion of the generalizability of research findings Generalizing from educational research 2009: 13-32
- The limitations of portfolios Inside Higher Ed 2009; 16
- Direct measures in environmental education evaluation: Behavioral intentions versus observable actions Applied Environmental Education and Communication 2009; 8 (3-4): 165-173
-
Assessing School Effectiveness
EVALUATION REVIEW
2008; 32 (6): 511-525
Abstract
The Collegiate Learning Assessment (CLA) program measures value added in colleges and universities, by testing the ability of freshmen and seniors to think logically and write clearly. The program is popular enough that it has attracted critics. In this paper, we outline the methods used by the CLA to determine value added. We summarize the criticisms, which revolve around the question of which students take the CLA tests. Typically, samples are not random, so that selection bias is a concern, as is confounding. We respond by showing that criticisms of CLA procedures are not supported by the data.
View details for DOI 10.1177/0193841X08325948
View details for Web of Science ID 000260738700001
View details for PubMedID 18981333
-
Measuring Knowledge Structure: Reliability of Concept Mapping Assessment in Medical Education
ACADEMIC MEDICINE
2008; 83 (12): 1196-1203
Abstract
To test the reliability of concept map assessment, which can be used to assess an individual's "knowledge structure," in a medical education setting. In 2004, 52 senior residents (pediatrics and internal medicine) and fourth-year medical students at the University of California-Davis School of Medicine created separate concept maps about two different subject domains (asthma and diabetes) on two separate occasions each (four total maps). Maps were rated using four different scoring systems: structural (S; counting propositions), quality (Q; rating the quality of propositions), importance/quality (I/Q; rating importance and quality of propositions), and a hybrid system (H; combining elements of S with I/Q). The authors used generalizability theory to determine reliability. Learners (universe score) contributed 40% to 44% to total score variation for the Q, I/Q, and H scoring systems, but only 10% for the S scoring system. There was a large learner-occasion-domain interaction effect (19%-23%). Subsequent analysis of each subject domain separately demonstrated a large learner-occasion interaction effect (31%-37%) and determined that administration on four to five occasions was necessary to achieve adequate reliability. Rater variation was uniformly low. The Q, I/Q, and H scoring systems demonstrated similar reliability and were all more reliable than the S system. The findings suggest that training and practice are required to perform the assessment task, and, as administered in this study, four to five testing occasions are required to achieve adequate reliability. Further research should focus on whether alterations in the concept mapping task could allow it to be administered over fewer occasions while maintaining adequate reliability.
View details for Web of Science ID 000267654800030
View details for PubMedID 19202500
-
Application of Generalizability theory to concept map assessment research
APPLIED MEASUREMENT IN EDUCATION
2008; 21 (3): 273-291
View details for DOI 10.1080/08957340802161840
View details for Web of Science ID 000258330600005
- Teachers’ decision making: From Alan J. Bishop to today Critical issues in mathematics education Springer. 2008: 37–67
- Reflections on quantitative reasoning: An assessment perspective Calculation vs. context: Quantitative literacy and its implications for teacher education 2008: 27-47
-
On the Impact of Formative Assessment on Student Motivation, Achievement, and Conceptual Change
APPLIED MEASUREMENT IN EDUCATION
2008; 21 (4): 335-359
View details for DOI 10.1080/08957340802347845
View details for Web of Science ID 000261935300004
-
On the Fidelity of Implementing Embedded Formative Assessments and Its Relation to Student Learning
APPLIED MEASUREMENT IN EDUCATION
2008; 21 (4): 360-389
View details for DOI 10.1080/08957340802347852
View details for Web of Science ID 000261935300005
-
On the Impact of Curriculum-Embedded Formative Assessment on Learning: A Collaboration between Curriculum and Assessment Developers
APPLIED MEASUREMENT IN EDUCATION
2008; 21 (4): 295-314
View details for DOI 10.1080/08957340802347647
View details for Web of Science ID 000261935300002
-
From Formal Embedded Assessments to Reflective Lessons: The Development of Formative Assessment Studies
APPLIED MEASUREMENT IN EDUCATION
2008; 21 (4): 315-334
View details for DOI 10.1080/08957340802347787
View details for Web of Science ID 000261935300003
-
Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results
REVIEW OF EDUCATIONAL RESEARCH
2007; 77 (4): 454-499
View details for DOI 10.3102/0034654307310317
View details for Web of Science ID 000251285200002
-
The collegiate learning assessment - Facts and fantasies
EVALUATION REVIEW
2007; 31 (5): 415-439
Abstract
The Collegiate Learning Assessment (CLA) is a computer administered, open-ended (as opposed to multiple-choice) test of analytic reasoning, critical thinking, problem solving, and written communication skills. Because the CLA has been endorsed by several national higher education commissions, it has come under intense scrutiny by faculty members, college administrators, testing experts, legislators, and others. This article describes the CLA's measures and what they do and do not assess, how dependably they measure what they claim to measure, and how CLA scores differ from those on other direct and indirect measures of college student learning. For instance, analyses are conducted at the school rather than the student level and results are adjusted for input to assess whether the progress students are making at their school is better or worse than what would be expected given the progress of "similarly situated" students (in terms of incoming ability) at other colleges.
View details for DOI 10.1177/0193841X07303318
View details for Web of Science ID 000249431100001
View details for PubMedID 17761805
-
Criterion-based training with surgical simulators: proficiency of experienced surgeons.
JSLS : Journal of the Society of Laparoendoscopic Surgeons / Society of Laparoendoscopic Surgeons
2007; 11 (3): 273-302
Abstract
In our effort to establish criterion-based skills training for surgeons, we assessed the performance of 17 experienced laparoscopic surgeons on basic technical surgical skills recorded electronically in 26 modules selected in 5 commercially available, computer-based simulators. Performance data were derived from selected surgeons randomly assigned to simulator stations, and practicing repetitively during one and one-half day sessions on 5 different simulators. We measured surgeon proficiency defined as efficient, error-free performance and developed proficiency score formulas for each module. Demographic and opinion data were also collected. Surgeons' performance demonstrated a sharp learning curve with the most performance improvement seen in early practice attempts. Median scores and performance levels at the 10th, 25th, 75th, and 90th percentiles are provided for each module. Construct validity was examined for 2 modules by comparing experienced surgeons' performance with that of a convenience sample of less-experienced surgeons. A simple mathematical method for scoring performance is applicable to these simulators. Proficiency levels for training courses can now be specified objectively by residency directors and by professional organizations for different levels of training or post-training assessment of technical performance. But data users should be cautious due to the small sample size in this study and the need for further study into the reliability and validity of the use of surgical simulators as assessment tools.
View details for PubMedID 17931510
- Estimating causal effects using experimental and observational design American Educational Research Association. 2007
- A brief history of student learning assessment: How we got where we are and a proposal for where to go next Association of American Colleges and Universities. 2007
-
Windows into the mind
HIGHER EDUCATION
2005; 49 (4): 413-430
View details for DOI 10.1007/s10734-004-9448-9
View details for Web of Science ID 000231229900001
-
An approach to measuring cognitive outcomes across higher education institutions
RESEARCH IN HIGHER EDUCATION
2005; 46 (3): 251-276
View details for DOI 10.1007/s11162-004-1640-3
View details for Web of Science ID 000235624000001
-
Comparison of two concept-mapping techniques: Implications for scoring, interpretation, and use
JOURNAL OF RESEARCH IN SCIENCE TEACHING
2005; 42 (2): 166-184
View details for DOI 10.1002/tea.20049
View details for Web of Science ID 000227024700002
-
Evaluating students' science notebooks as an assessment tool
INTERNATIONAL JOURNAL OF SCIENCE EDUCATION
2004; 26 (12): 1477-1506
View details for DOI 10.1080/0950069042000177299
View details for Web of Science ID 000225202500003
-
Lee J. Cronbach.
Proceedings of the American Philosophical Society
2003; 147 (4): 379-385
View details for PubMedID 15025124
-
On the evaluation of systemic science education reform: Searching for instructional sensitivity
JOURNAL OF RESEARCH IN SCIENCE TEACHING
2002; 39 (5): 369-393
View details for DOI 10.1002/tea.10027
View details for Web of Science ID 000175090400001
-
Comparison of the reliability and validity of scores from two concept-mapping techniques
JOURNAL OF RESEARCH IN SCIENCE TEACHING
2001; 38 (2): 260-278
View details for Web of Science ID 000166742100006
-
The effect of simulator use on learning and self-assessment: The case of Stanford University's E-Pelvis simulator
Conference on Medicine Meets Virtual Reality 2001
I O S PRESS. 2001: 396–400
View details for Web of Science ID 000169103300074
View details for PubMedID 11317776
-
The effects of content, format, and inquiry level on science performance assessment scores
APPLIED MEASUREMENT IN EDUCATION
2000; 13 (2): 139-160
View details for Web of Science ID 000086300300002
-
Note on sources of sampling variability in science performance assessments
JOURNAL OF EDUCATIONAL MEASUREMENT
1999; 36 (1): 61-71
View details for Web of Science ID 000079978300004
-
On the development and evaluation of a shell for generating science performance assessments
INTERNATIONAL JOURNAL OF SCIENCE EDUCATION
1999; 21 (3): 293-315
View details for Web of Science ID 000079040100005
-
Toward a science performance assessment technology
7th EARLI Conference
PERGAMON-ELSEVIER SCIENCE LTD. 1998: 171–84
View details for Web of Science ID 000075460800005
-
Analytic versus holistic scoring of science performance tasks
APPLIED MEASUREMENT IN EDUCATION
1998; 11 (2): 121-137
View details for Web of Science ID 000072776300001
-
Gender and racial/ethnic differences on performance assessments in science
EDUCATIONAL EVALUATION AND POLICY ANALYSIS
1997; 19 (2): 83-97
View details for Web of Science ID A1997XE01200001
-
Rhetoric and reality in science performance assessments: An update
JOURNAL OF RESEARCH IN SCIENCE TEACHING
1996; 33 (10): 1045-1063
View details for Web of Science ID A1996VV66800002
-
Problems and issues in the use of concept maps in science assessment
JOURNAL OF RESEARCH IN SCIENCE TEACHING
1996; 33 (6): 569-600
View details for Web of Science ID A1996UZ55700002
-
On the structure of social self-concept for pre-, early, and late adolescents: A test of the Shavelson, Hubner, and Stanton (1976) model
JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY
1996; 70 (3): 599-613
Abstract
This study is the first to empirically validate the social self-concept component of the R. J. Shavelson, J. J. Hubner, and G. C. Stanton (1976) model. The primary purpose was to test for each of 3 age groups--preadolescents (Grade 3), early adolescents (Grade 7), and late adolescents (Grade 11)--3 hypotheses bearing on the structure of social self-concept within the context of this model: (a) that it is multidimensional, (b) that it is hierarchically ordered, and (c) that it becomes increasingly differentiated with age. Given evidence of a hierarchical social self-concept structure, a secondary focus of the study was to determine the extent to which this pattern held across age. On the basis of the analysis of covariance structures within the framework of confirmatory factor analysis, results revealed a multidimensional social self-concept structure that becomes increasingly differentiated and a hierarchical ordering that becomes better defined with age. Overall, findings were consistent with both the R. J. Shavelson et al. (1976) conceptualization of self-concept structure and developmental processes that underlie self-concept formation.
View details for Web of Science ID A1996TZ88300015
View details for PubMedID 8851744
-
ON GETTING IT RIGHT
EDUCATIONAL EVALUATION AND POLICY ANALYSIS
1995; 17 (3): 275-279
View details for Web of Science ID A1995RW98500002
-
SELF-CONCEPT - VALIDATION OF CONSTRUCT INTERPRETATIONS
REVIEW OF EDUCATIONAL RESEARCH
1976; 46 (3): 407-441
View details for Web of Science ID A1976CJ09200004
-
3 EXPERIMENTS ON LEARNING TO TEACH
JOURNAL OF TEACHER EDUCATION
1976; 27 (2): 174-180
View details for Web of Science ID A1976CC17500022
-
METHOD FOR EXAMINING SUBJECT-MATTER STRUCTURE IN INSTRUCTIONAL MATERIAL
JOURNAL OF STRUCTURAL LEARNING
1975; 4 (3): 199-218
View details for Web of Science ID A1975AQ90300002
-
CONSTRUCT VALIDATION - METHODOLOGY AND APPLICATION TO 3 MEASURES OF COGNITIVE STRUCTURE
JOURNAL OF EDUCATIONAL MEASUREMENT
1975; 12 (2): 67-85
View details for Web of Science ID A1975AD82200001
-
SURVIVAL IN FIELD OF EDUCATION AFTER INTERN TRAINING - TRAINING INSTITUTIONS PERSPECTIVE
CALIFORNIA JOURNAL OF EDUCATIONAL RESEARCH
1974; 25 (4): 161-179
View details for Web of Science ID A1974U304500001
-
EFFECTS OF POSITION AND TYPE OF QUESTION ON LEARNING FROM PROSE MATERIAL - INTERACTION OF TREATMENTS WITH INDIVIDUAL-DIFFERENCES
JOURNAL OF EDUCATIONAL PSYCHOLOGY
1974; 66 (1): 40-48
View details for Web of Science ID A1974S160200006
-
CRITERION-REFERENCED TESTING - COMMENTS ON RELIABILITY
JOURNAL OF EDUCATIONAL MEASUREMENT
1972; 9 (2): 133-137
View details for Web of Science ID A1972M503300005