Bio


His current work focuses on the design of assessments and assessment systems that measure college students' learning, both their development of competence/achievement and so-called "soft skills" such as perspective taking. He co-created the Collegiate Learning Assessment (CLA) with Steve Klein and built statistical models for estimating value added for the CLA and other college-level assessments. This work is summarized in Measuring College Learning Responsibly: Accountability in a New Era (2010, Stanford University Press) and in recent papers on the measurement and statistical modeling of competence.

Academic Appointments


  • Emeritus Faculty, Academic Council, Graduate School of Education

Administrative Appointments


  • I. James Quillen Dean, School of Education (1995 - 2000)
  • Emeritus Professor of Psychology (by courtesy), School of Humanities and Sciences (1995 - Present)
  • Professor Emeritus, Stanford Graduate School of Education (1995 - Present)
  • Emeritus Affiliated Faculty, Stanford Institute for the Environment (2005 - 2007)

Honors & Awards


  • Humboldt Fellowship, Humboldt Foundation, Germany (1994)
  • AERA Review of Research Award, American Educational Research Association (2008)
  • AERA Review of Research Award, American Educational Research Association (1978)
  • E.F. Lindquist Award, American Educational Research Association (2011)
  • R.L. Linn Award, American Educational Research Association (2016)
  • E.L. Thorndike Award, American Psychological Association (2010)

Boards, Advisory Committees, Professional Organizations


  • Assistant Professor of Education, UCLA (1973 - 1975)
  • Associate Professor of Education, UCLA (1975 - 1979)
  • Professor of Education, UCLA (1979 - 1987)
  • Dean, Graduate School of Education, University of California, Santa Barbara (1987 - 1993)
  • Professor of Education, UCSB (1987 - 1996)
  • Professor of Education, UCSB, with an affiliated appointment in Statistics and Applied Probability (1993 - 1996)
  • Vice Chair and Chair, Board on Testing and Assessment, National Academy of Sciences (1993 - 1998)
  • Board Member, Yosemite National Institutes/NatureBridge (1996 - 2008)
  • Board Member, The Spencer Foundation (1997 - 2005)
  • Education Advisory Council (Chair), NatureBridge (2009 - 2012)
  • Board Member, BSCS (formerly Biological Sciences Curriculum Study) (2009 - 2018)
  • Chair, BSCS Board (2016 - 2018)
  • Member and Vice Chair, Governance Committee, Stanford Historical Society (2017 - 2019)

Professional Education


  • PhD, Stanford University, Educational Psychology (1971)
  • MA, San Jose State College, Psychology (1967)
  • BA, University of Oregon, Psychology (1964)

Research Interests


  • Assessment, Testing and Measurement
  • Higher Education
  • Psychology
  • Research Methods

Current Research and Scholarly Interests


Assessment of learning in higher education (including the Collegiate Learning Assessment); accountability in higher education; higher education policy.

All Publications


  • Rescue an Enterprise from Failure: An Innovative Assessment Tool for Simulated Performance ASSESSMENT OF LEARNING OUTCOMES IN HIGHER EDUCATION: CROSS-NATIONAL COMPARISONS AND PERSPECTIVES Oser, F., Mueller, S., Obex, T., Volery, T., Shavelson, R. J., Zlatkin-Troitschanskaia, O., Toepper, M., Pant, H. A., Lautenbach, C., Kuhn, C. 2018: 123–44
  • Performance indicators of learning in higher education institutions: an overview of the field RESEARCH HANDBOOK ON QUALITY, PERFORMANCE AND ACCOUNTABILITY IN HIGHER EDUCATION Shavelson, R. J., Zlatkin-Troitschanskaia, O., Marino, J. P., Hazelkorn, E., Coates, H., McCormick, A. C. 2018: 249–63
  • International Performance Assessment of Learning in Higher Education (iPAL): Research and Development ASSESSMENT OF LEARNING OUTCOMES IN HIGHER EDUCATION: CROSS-NATIONAL COMPARISONS AND PERSPECTIVES Shavelson, R. J., Zlatkin-Troitschanskaia, O., Marino, J. P., Zlatkin-Troitschanskaia, O., Toepper, M., Pant, H. A., Lautenbach, C., Kuhn, C. 2018: 193–214
  • Gavriel Salomon: In Memoriam EDUCATIONAL PSYCHOLOGY REVIEW Berliner, D., Phillips, D., Zeidner, M., de Corte, E., Shavelson, R., de Ibarrola, M., Clark, R. 2016; 28 (2): 207–13
  • On the practices and challenges of measuring higher education value added: the case of Colombia ASSESSMENT & EVALUATION IN HIGHER EDUCATION Shavelson, R. J., Domingue, B. W., Marino, J. P., Molina Mantilla, A., Morales Forero, A., Wiley, E. E. 2016; 41 (5): 695-720
  • The international state of research on measurement of competency in higher education STUDIES IN HIGHER EDUCATION Zlatkin-Troitschanskaia, O., Shavelson, R. J., Kuhn, C. 2015; 40 (3): 393-411
  • Beyond Dichotomies: Competence Viewed as a Continuum ZEITSCHRIFT FUR PSYCHOLOGIE-JOURNAL OF PSYCHOLOGY Blomeke, S., Gustafsson, J., Shavelson, R. J. 2015; 223 (1): 3-13
  • On the Factorial Structure of the SAT and Implications for Next-Generation College Readiness Assessments EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT Wiley, E. W., Shavelson, R. J., Kurpius, A. A. 2014; 74 (5): 859-874
  • Factors Contributing to Problem-Solving Performance in First-Semester Organic Chemistry JOURNAL OF CHEMICAL EDUCATION Lopez, E. J., Shavelson, R. J., Nandagopal, K., Szu, E., Penn, J. 2014; 91 (7): 976-981

    View details for DOI 10.1021/ed400696c

    View details for Web of Science ID 000339090200007

  • Using Formal Embedded Formative Assessments Aligned with a Short-Term Learning Progression to Promote Conceptual Change and Achievement in Science INTERNATIONAL JOURNAL OF SCIENCE EDUCATION Yin, Y., Tomita, M. K., Shavelson, R. J. 2014; 36 (4): 531-552
  • Self-regulated learning study strategies and academic performance in undergraduate organic chemistry: An investigation examining ethnically diverse students JOURNAL OF RESEARCH IN SCIENCE TEACHING Lopez, E. J., Nandagopal, K., Shavelson, R. J., Szu, E., Penn, J. 2013; 50 (6): 660-676

    View details for DOI 10.1002/tea.21095

    View details for Web of Science ID 000322006100003

  • On an Approach to Testing and Modeling Competence EDUCATIONAL PSYCHOLOGIST Shavelson, R. J. 2013; 48 (2): 73-86
  • Context matters: volunteer bias, small sample size, and the value of comparison groups in the assessment of research-based undergraduate introductory biology lab courses. Journal of Microbiology & Biology Education: JMBE Brownell, S. E., Kloser, M. J., Fukami, T., Shavelson, R. J. 2013; 14 (2): 176-182

    Abstract

    The shift from cookbook to authentic research-based lab courses in undergraduate biology necessitates evaluation and assessment of these novel courses. Although the biology education community has made progress in this area, it is important that we interpret the effectiveness of these courses with caution and remain mindful of inherent limitations to our study designs that may impact internal and external validity. The specific context of a research study can have a dramatic impact on the conclusions. We present a case study of our own three-year investigation of the impact of a research-based introductory lab course, highlighting how volunteer students, a lack of a comparison group, and small sample sizes can be limitations of a study design that can affect the interpretation of the effectiveness of a course.

    View details for DOI 10.1128/jmbe.v14i2.609

    View details for PubMedID 24358380

  • Understanding Academic Performance in Organic Chemistry JOURNAL OF CHEMICAL EDUCATION Szu, E., Nandagopal, K., Shavelson, R. J., Lopez, E. J., Penn, J. H., Scharberg, M., Hill, G. W. 2011; 88 (9): 1238-1242

    View details for DOI 10.1021/ed900067m

    View details for Web of Science ID 000293813100011

  • Validating the use of concept-mapping as a diagnostic assessment tool in organic chemistry: implications for teaching CHEMISTRY EDUCATION RESEARCH AND PRACTICE Lopez, E., Kim, J., Nandagopal, K., Cardin, N., Shavelson, R. J., Penn, J. H. 2011; 12 (2): 133-141

    View details for DOI 10.1039/C1RP90018H

    View details for Web of Science ID 000289951500003

  • Supporting Valid Interpretations of Learning Progression Level Diagnoses JOURNAL OF RESEARCH IN SCIENCE TEACHING Steedle, J. T., Shavelson, R. J. 2009; 46 (6): 699-715

    View details for DOI 10.1002/tea.20308

    View details for Web of Science ID 000268606600007

  • Generalizability theory and its contribution to the discussion of the generalizability of research findings Generalizing from educational research Shavelson, R. J., Webb, N. M. 2009: 13-32
  • The limitations of portfolios Inside Higher Ed Shavelson, R. J., Klein, S., Benjamin, R. 2009; 16
  • Direct measures in environmental education evaluation: Behavioral intentions versus observable actions Applied Environmental Education and Communication Camargo, C., Shavelson, R. 2009; 8 (3-4): 165-173
  • Assessing School Effectiveness EVALUATION REVIEW Klein, S., Freedman, D., Shavelson, R., Bolus, R. 2008; 32 (6): 511-525

    Abstract

    The Collegiate Learning Assessment (CLA) program measures value added in colleges and universities by testing the ability of freshmen and seniors to think logically and write clearly. The program is popular enough that it has attracted critics. In this paper, we outline the methods used by the CLA to determine value added. We summarize the criticisms, which revolve around the question of which students take the CLA tests. Typically, samples are not random, so that selection bias is a concern, as is confounding. We respond by showing that criticisms of CLA procedures are not supported by the data.

    View details for DOI 10.1177/0193841X08325948

    View details for Web of Science ID 000260738700001

    View details for PubMedID 18981333
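
    The abstract above describes the CLA's residual approach to value added: regress school-level senior outcomes on entering ability and read the residual as performance above or below expectation. The following is a minimal sketch of that idea, assuming school-mean scores and a single-predictor OLS fit; the function, variable names, and data are hypothetical illustrations, not the CLA's operational model.

        # Minimal sketch of residual-based value added (hypothetical;
        # the CLA's operational model is more elaborate than this).
        import numpy as np

        def value_added(entering_ability, senior_score):
            """Regress school-mean senior scores on school-mean entering
            ability (e.g., mean SAT); residuals are value-added estimates."""
            x = np.asarray(entering_ability, dtype=float)
            y = np.asarray(senior_score, dtype=float)
            slope, intercept = np.polyfit(x, y, 1)  # OLS; highest power first
            expected = intercept + slope * x        # expected senior score
            return y - expected                     # above/below expectation

        # Hypothetical data: three schools' mean entering scores and
        # mean senior scores.
        print(value_added([1000, 1100, 1200], [1150, 1180, 1300]))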

  • Measuring Knowledge Structure: Reliability of Concept Mapping Assessment in Medical Education ACADEMIC MEDICINE Srinivasan, M., McElvany, M., Shay, J. M., Shavelson, R. J., West, D. C. 2008; 83 (12): 1196-1203

    Abstract

    To test the reliability of concept map assessment, which can be used to assess an individual's "knowledge structure," in a medical education setting. In 2004, 52 senior residents (pediatrics and internal medicine) and fourth-year medical students at the University of California-Davis School of Medicine created separate concept maps about two different subject domains (asthma and diabetes) on two separate occasions each (four total maps). Maps were rated using four different scoring systems: structural (S; counting propositions), quality (Q; rating the quality of propositions), importance/quality (I/Q; rating importance and quality of propositions), and a hybrid system (H; combining elements of S with I/Q). The authors used generalizability theory to determine reliability. Learners (universe score) contributed 40% to 44% to total score variation for the Q, I/Q, and H scoring systems, but only 10% for the S scoring system. There was a large learner-occasion-domain interaction effect (19%-23%). Subsequent analysis of each subject domain separately demonstrated a large learner-occasion interaction effect (31%-37%) and determined that administration on four to five occasions was necessary to achieve adequate reliability. Rater variation was uniformly low. The Q, I/Q, and H scoring systems demonstrated similar reliability and were all more reliable than the S system. The findings suggest that training and practice are required to perform the assessment task, and, as administered in this study, four to five testing occasions are required to achieve adequate reliability. Further research should focus on whether alterations in the concept mapping task could allow it to be administered over fewer occasions while maintaining adequate reliability.

    View details for Web of Science ID 000267654800030

    View details for PubMedID 19202500
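
    The abstract above reports variance components from a generalizability (G) study and concludes that four to five occasions are needed for adequate reliability. Below is a worked sketch of that decision (D) study projection, assuming a learners-by-occasions design and illustrative variance shares in the ranges the abstract reports (about 0.40 for learners and 0.35 for the learner-occasion interaction); these are not the study's exact estimates.

        # D-study sketch under generalizability theory: project the
        # relative G coefficient as the number of occasions grows,
        # E(rho^2) = var_p / (var_p + var_delta / n_o).
        def g_coefficient(var_learner, var_rel_error, n_occasions):
            """Projected reliability when averaging over n_occasions."""
            return var_learner / (var_learner + var_rel_error / n_occasions)

        for n in range(1, 6):
            print(n, round(g_coefficient(0.40, 0.35, n), 2))
        # With these illustrative components, the coefficient climbs past
        # the conventional 0.80 threshold only around four to five
        # occasions, consistent with the abstract's conclusion.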

  • Application of generalizability theory to concept map assessment research APPLIED MEASUREMENT IN EDUCATION Yin, Y., Shavelson, R. J. 2008; 21 (3): 273-291
  • Reflections on quantitative reasoning: An assessment perspective Calculation vs. context: Quantitative literacy and its implications for teacher education Shavelson, R. J. 2008: 27-47
  • On the Impact of Formative Assessment on Student Motivation, Achievement, and Conceptual Change APPLIED MEASUREMENT IN EDUCATION Yin, Y., Shavelson, R. J., Ayala, C. C., Ruiz-Primo, M. A., Brandon, P. R., Furtak, E. M., Tomita, M. K., Young, D. B. 2008; 21 (4): 335-359
  • On the Fidelity of Implementing Embedded Formative Assessments and Its Relation to Student Learning APPLIED MEASUREMENT IN EDUCATION Furtak, E. M., Ruiz-Primo, M. A., Shemwell, J. T., Ayala, C. C., Brandon, P. R., Shavelson, R. J., Yin, Y. 2008; 21 (4): 360-389
  • On the Impact of Curriculum-Embedded Formative Assessment on Learning: A Collaboration between Curriculum and Assessment Developers APPLIED MEASUREMENT IN EDUCATION Shavelson, R. J., Young, D. B., Ayala, C. C., Brandon, P. R., Furtak, E. M., Ruiz-Primo, M. A., Tomita, M. K., Yin, Y. 2008; 21 (4): 295-314
  • From Formal Embedded Assessments to Reflective Lessons: The Development of Formative Assessment Studies APPLIED MEASUREMENT IN EDUCATION Ayala, C. C., Shavelson, R. J., Ruiz-Primo, M. A., Brandon, P. R., Yin, Y., Furtak, E. M., Young, D. B. 2008; 21 (4): 315-334
  • Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results REVIEW OF EDUCATIONAL RESEARCH Seidel, T., Shavelson, R. J. 2007; 77 (4): 454-499
  • The collegiate learning assessment - Facts and fantasies EVALUATION REVIEW Klein, S., Benjamin, R., Shavelson, R., Bolus, R. 2007; 31 (5): 415-439

    Abstract

    The Collegiate Learning Assessment (CLA) is a computer administered, open-ended (as opposed to multiple-choice) test of analytic reasoning, critical thinking, problem solving, and written communication skills. Because the CLA has been endorsed by several national higher education commissions, it has come under intense scrutiny by faculty members, college administrators, testing experts, legislators, and others. This article describes the CLA's measures and what they do and do not assess, how dependably they measure what they claim to measure, and how CLA scores differ from those on other direct and indirect measures of college student learning. For instance, analyses are conducted at the school rather than the student level and results are adjusted for input to assess whether the progress students are making at their school is better or worse than what would be expected given the progress of "similarly situated" students (in terms of incoming ability) at other colleges.

    View details for DOI 10.1177/0193841X07303318

    View details for Web of Science ID 000249431100001

    View details for PubMedID 17761805

  • Criterion-based training with surgical simulators: proficiency of experienced surgeons. JSLS: Journal of the Society of Laparoendoscopic Surgeons Heinrichs, W. L., Lukoff, B., Youngblood, P., Dev, P., Shavelson, R., Hasson, H. M., Satava, R. M., McDougall, E. M., Wetter, P. A. 2007; 11 (3): 273-302

    Abstract

    In our effort to establish criterion-based skills training for surgeons, we assessed the performance of 17 experienced laparoscopic surgeons on basic technical surgical skills recorded electronically in 26 modules selected in 5 commercially available, computer-based simulators. Performance data were derived from selected surgeons randomly assigned to simulator stations, and practicing repetitively during one and one-half day sessions on 5 different simulators. We measured surgeon proficiency defined as efficient, error-free performance and developed proficiency score formulas for each module. Demographic and opinion data were also collected. Surgeons' performance demonstrated a sharp learning curve with the most performance improvement seen in early practice attempts. Median scores and performance levels at the 10th, 25th, 75th, and 90th percentiles are provided for each module. Construct validity was examined for 2 modules by comparing experienced surgeons' performance with that of a convenience sample of less-experienced surgeons. A simple mathematical method for scoring performance is applicable to these simulators. Proficiency levels for training courses can now be specified objectively by residency directors and by professional organizations for different levels of training or post-training assessment of technical performance. But data users should be cautious due to the small sample size in this study and the need for further study into the reliability and validity of the use of surgical simulators as assessment tools.

    View details for PubMedID 17931510
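
    The abstract above defines proficiency as efficient, error-free performance and benchmarks scores at fixed percentiles. The sketch below illustrates one such scoring scheme with an invented time-plus-errors penalty; the paper's actual formulas are module-specific, and the weights and data here are hypothetical.

        # Hypothetical proficiency score: penalize elapsed time and error
        # count (lower is better). The study's module-specific formulas
        # differ; these weights are invented for illustration.
        import numpy as np

        def proficiency(time_s, errors, w_time=1.0, w_error=10.0):
            """Weighted penalty combining time (seconds) and errors."""
            return w_time * time_s + w_error * errors

        # Score a small cohort of attempts and report the percentile
        # benchmarks the study uses (10th, 25th, 75th, 90th, plus median).
        attempts = [(120, 2), (90, 0), (150, 5), (80, 1), (110, 3)]
        scores = [proficiency(t, e) for t, e in attempts]
        print(np.percentile(scores, [10, 25, 50, 75, 90]))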

  • Windows into the mind HIGHER EDUCATION Shavelson, R. J., Ruiz-Primo, M. A., Wiley, E. W. 2005; 49 (4): 413-430
  • An approach to measuring cognitive outcomes across higher education institutions RESEARCH IN HIGHER EDUCATION Klein, S. P., Kuh, G. D., Chun, M., Hamilton, L., Shavelson, R. 2005; 46 (3): 251-276
  • Comparison of two concept-mapping techniques: Implications for scoring, interpretation, and use JOURNAL OF RESEARCH IN SCIENCE TEACHING Yin, Y., Vanides, J., Ruiz-Primo, M. A., Ayala, C. C., Shavelson, R. J. 2005; 42 (2): 166-184

    View details for DOI 10.1002/tea.20049

    View details for Web of Science ID 000227024700002

  • Evaluating students' science notebooks as an assessment tool INTERNATIONAL JOURNAL OF SCIENCE EDUCATION Ruiz-Primo, M. A., Li, M., Ayala, C., Shavelson, R. J. 2004; 26 (12): 1477-1506
  • Lee J. Cronbach. Proceedings of the American Philosophical Society Shavelson, R. J. 2003; 147 (4): 379-385

    View details for PubMedID 15025124

  • On the evaluation of systemic science education reform: Searching for instructional sensitivity JOURNAL OF RESEARCH IN SCIENCE TEACHING Ruiz-Primo, M. A., Shavelson, R. J., Hamilton, L., Klein, S. 2002; 39 (5): 369-393

    View details for DOI 10.1002/tea.10027

    View details for Web of Science ID 000175090400001

  • Comparison of the reliability and validity of scores from two concept-mapping techniques JOURNAL OF RESEARCH IN SCIENCE TEACHING Ruiz-Primo, M. A., Schultz, S. E., Li, M., Shavelson, R. J. 2001; 38 (2): 260-278
  • The effects of content, format, and inquiry level on science performance assessment scores APPLIED MEASUREMENT IN EDUCATION Stecher, B. M., Klein, S. P., Solano-Flores, G., McCaffrey, D., Robyn, A., Shavelson, R. J., Haertel, E. 2000; 13 (2): 139-160
  • Note on sources of sampling variability in science performance assessments JOURNAL OF EDUCATIONAL MEASUREMENT Shavelson, R. J., Ruiz-Primo, M. A., Wiley, E. W. 1999; 36 (1): 61-71
  • On the development and evaluation of a shell for generating science performance assessments INTERNATIONAL JOURNAL OF SCIENCE EDUCATION Solano-Flores, G., Jovanovic, J., Shavelson, R. J., Bachman, M. 1999; 21 (3): 293-315
  • Toward a science performance assessment technology 7th EARLI Conference Shavelson, R. J., Solano-Flores, G., Ruiz-Primo, M. A. Pergamon-Elsevier Science Ltd. 1998: 171–84
  • Analytic versus holistic scoring of science performance tasks APPLIED MEASUREMENT IN EDUCATION Klein, S. P., Stecher, B. M., Shavelson, R. J., McCaffrey, D., Ormseth, T., Bell, R. M., Comfort, K., Othman, A. R. 1998; 11 (2): 121-137
  • Gender and racial/ethnic differences on performance assessments in science EDUCATIONAL EVALUATION AND POLICY ANALYSIS Klein, S. P., Jovanovic, J., Stecher, B. M., McCaffrey, D., Shavelson, R. J., Haertel, E., Solano-Flores, G., Comfort, K. 1997; 19 (2): 83-97
  • Rhetoric and reality in science performance assessments: An update JOURNAL OF RESEARCH IN SCIENCE TEACHING Ruiz-Primo, M. A., Shavelson, R. J. 1996; 33 (10): 1045-1063
  • Problems and issues in the use of concept maps in science assessment JOURNAL OF RESEARCH IN SCIENCE TEACHING Ruiz-Primo, M. A., Shavelson, R. J. 1996; 33 (6): 569-600
  • On the structure of social self-concept for pre-, early, and late adolescents: A test of the Shavelson, Hubner, and Stanton (1976) model JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY Byrne, B. M., Shavelson, R. J. 1996; 70 (3): 599-613

    Abstract

    This study is the first to empirically validate the social self-concept component of the R. J. Shavelson, J. J. Hubner, and G. C. Stanton (1976) model. The primary purpose was to test, for each of 3 age groups (preadolescents, Grade 3; early adolescents, Grade 7; and late adolescents, Grade 11), 3 hypotheses bearing on the structure of social self-concept within the context of this model: (a) that it is multidimensional, (b) that it is hierarchically ordered, and (c) that it becomes increasingly differentiated with age. Given evidence of a hierarchical social self-concept structure, a secondary focus of the study was to determine the extent to which this pattern held across age. On the basis of the analysis of covariance structures within the framework of confirmatory factor analysis, results revealed a multidimensional social self-concept structure that becomes increasingly differentiated and a hierarchical ordering that becomes better defined with age. Overall, findings were consistent with both the R. J. Shavelson et al. (1976) conceptualization of self-concept structure and developmental processes that underlie self-concept formation.

    View details for Web of Science ID A1996TZ88300015

    View details for PubMedID 8851744

  • On getting it right EDUCATIONAL EVALUATION AND POLICY ANALYSIS Shavelson, R. J., Noreen, N. M. 1995; 17 (3): 275-279
  • Self-concept: Validation of construct interpretations REVIEW OF EDUCATIONAL RESEARCH Shavelson, R. J., Hubner, J. J., Stanton, G. C. 1976; 46 (3): 407-441
  • Three experiments on learning to teach JOURNAL OF TEACHER EDUCATION Clark, C. M., Snow, R. E., Shavelson, R. J. 1976; 27 (2): 174-180
  • Method for examining subject-matter structure in instructional material JOURNAL OF STRUCTURAL LEARNING Shavelson, R. J., Geeslin, W. E. 1975; 4 (3): 199-218
  • Construct validation: Methodology and application to three measures of cognitive structure JOURNAL OF EDUCATIONAL MEASUREMENT Shavelson, R. J., Stanton, G. C. 1975; 12 (2): 67-85
  • Survival in the field of education after intern training: The training institution's perspective CALIFORNIA JOURNAL OF EDUCATIONAL RESEARCH Shavelson, R. J., Trincher, R. L. 1974; 25 (4): 161-179
  • Effects of position and type of question on learning from prose material: Interaction of treatments with individual differences JOURNAL OF EDUCATIONAL PSYCHOLOGY Shavelson, R. J., Berliner, D. C., Ravitch, M. M., Loeding, D. 1974; 66 (1): 40-48
  • Criterion-referenced testing: Comments on reliability JOURNAL OF EDUCATIONAL MEASUREMENT Shavelson, R. J., Block, J. H., Ravitch, M. M. 1972; 9 (2): 133-137