Edward Haertel

Jacks Family Professor of Education, Emeritus

Graduate School of Education

Bio

Dr. Haertel is an expert in educational testing and assessment. His research and teaching focus on psychometrics and educational policy, especially test-based accountability and related policy uses of test data. His recent work has examined standard-setting methods; limitations of value-added models for teacher and school accountability; impacts of testing on curriculum, students, and educational policy; test reliability; and generalizability theory.

Academic Appointments

Emeritus Faculty, Acad Council, Graduate School of Education

Administrative Appointments

Jacks Family Professor of Education, Emeritus, Stanford Graduate School of Education (2013 - Present)
Jacks Family Professor of Education, Stanford Graduate School of Education (2008 - 2012)
Associate Dean for Faculty Affairs, Stanford Graduate School of Education (2005 - 2010)
Professor of Education, Stanford Graduate School of Education (1992 - 2008)
Associate Professor of Education, Stanford Graduate School of Education (1987 - 1992)
Assistant Professor of Education, Stanford Graduate School of Education (1980 - 1987)

Boards, Advisory Committees, Professional Organizations

Member, Technical Design Group, California Department of Education, Assessment and Accountability Unit (2015 - Present)
Member, Smarter Balanced Assessment Consortium Technical Advisory Committee (2019 - Present)
Assistant Professor, University of Illinois, Chicago (1979 - 1980)

Professional Education

PhD, University of Chicago, Measurement, Evaluation and Statistical Analysis (1980)
BA, University of Wisconsin-Madison, Mathematics (1971)

Contact

Academic
haertel@stanford.edu
University - Emeritus faculty
Department: Graduate School of Education
Position: Emeritus Faculty, Acad Council
- SCHOOL OF EDUCATION
- 3096
- Stanford, California 94305-3096
(650) 725-7412 (fax)

Admin. Support Elayne Weissler-Martello elayne@stanford.edu

Additional Info

Mail Code: 3009

Research Interests

Assessment, Testing and Measurement
International and Comparative Education
School Reform
Standards
Teachers and Teaching

Current Research and Scholarly Interests

Functions of test scores in discourse about education; how testing shapes ideas of success and failure for students, schools, and public education as a whole.

2023-24 Courses

Independent Studies (8)
- Directed Reading
  EDUC 480 (Aut, Win, Spr, Sum)
- Directed Reading in Education
  EDUC 180 (Aut, Win, Spr, Sum)
- Directed Research
  EDUC 490 (Aut, Win, Spr, Sum)
- Directed Research in Education
  EDUC 190 (Aut, Win, Spr, Sum)
- Honors Research
  EDUC 140 (Aut)
- Master's Thesis
  EDUC 185 (Aut, Win, Sum)
- Practicum
  EDUC 470 (Aut, Win, Spr, Sum)
- Supervised Internship
  EDUC 380 (Aut, Win, Spr, Sum)

All Publications

Comparability of Large-Scale Educational Assessments: Issues and Recommendations. edited by Berman, A. I., Haertel, E. H., Pellegrino, J. W. National Academy of Education. 2020

View details for DOI 10.31094/2020/1
The Testing Charade: Pretending to Make Schools Better (Book Review) AMERICAN JOURNAL OF EDUCATION Book Review Authored by: Haertel, E. H. 2018; 124 (3): 373–77

View details for Web of Science ID 000430443400005
Measuring Cultural Dimensions of Classroom Interactions EDUCATIONAL ASSESSMENT Jensen, B., Grajeda, S., Haertel, E. 2018; 23 (4): 250–76

View details for DOI 10.1080/10627197.2018.1515010

View details for Web of Science ID 000446134500002
Tests, Test Scores, and Constructs EDUCATIONAL PSYCHOLOGIST Haertel, E. H. 2018; 53 (3): 203–16

View details for DOI 10.1080/00461520.2018.1476868

View details for Web of Science ID 000443865300004
Fairness using derived scores Fairness in Educational Assessment and Measurement Haertel, E., Ho, A. Routledge. 2016: 233–254
Engaging methodological pluralism Handbook of research on teaching Moss, P. A., Haertel, E. H. 2016: 127-247
Selection of Common Items as an Unrecognized Source of Variability in Test Equating: A Bootstrap Approximation Assuming Random Sampling of Common Items APPLIED MEASUREMENT IN EDUCATION Michaelides, M. P., Haertel, E. H. 2014; 27 (1): 46-57

View details for DOI 10.1080/08957347.2013.853069

View details for Web of Science ID 000329424600004
Getting the Help We Need JOURNAL OF EDUCATIONAL MEASUREMENT Haertel, E. 2013; 50 (1): 84-90

View details for DOI 10.1111/jedm.12002

View details for Web of Science ID 000316286300003
Reliability and Validity of Inferences about Teachers Based on Student Scores. William H. Angoff Memorial Lecture Series. Educational Testing Service Haertel, E. H. 2013
Improving ability measurement in surveys by following the principles of IRT: The Wordsum vocabulary test in the General Social Survey SOCIAL SCIENCE RESEARCH Cor, M. K., Haertel, E., Krosnick, J. A., Malhotra, N. 2012; 41 (5): 1003-1016

Abstract

Survey researchers often administer batteries of questions to measure respondents' abilities, but these batteries are not always designed in keeping with the principles of optimal test construction. This paper illustrates one instance in which following these principles can improve a measurement tool used widely in the social and behavioral sciences: the GSS's vocabulary test called "Wordsum". This ten-item test is composed of very difficult items and very easy items, and item response theory (IRT) suggests that the omission of moderately difficult items is likely to have handicapped Wordsum's effectiveness. Analyses of data from national samples of thousands of American adults show that after adding four moderately difficult items to create a 14-item battery, "Wordsumplus" (1) outperformed the original battery in terms of quality indicators suggested by classical test theory; (2) reduced the standard error of IRT ability estimates in the middle of the latent ability dimension; and (3) exhibited higher concurrent validity. These findings show how to improve Wordsum and suggest that analysts should use a score based on all 14 items instead of using the summary score provided by the GSS, which is based on only the original 10 items. These results also show more generally how surveys measuring abilities (and other constructs) can benefit from careful application of insights from the contemporary educational testing literature.

View details for DOI 10.1016/j.ssresearch.2012.05.007

View details for Web of Science ID 000306620600001

View details for PubMedID 23017913
Evaluating teacher evaluation PHI DELTA KAPPAN Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., Rothstein, J. 2012; 93 (6): 8-15

View details for Web of Science ID 000301306000005
The briefing book method Setting performance standards: Foundations, methods, and innovations Haertel, E. H., Beimers, J. N., Miles, J. A. 2012: 283-299
The Effect of Ignoring Classroom‐Level Variance in Estimating the Generalizability of School Mean Scores Educational Measurement: Issues and Practice Wei, X., Haertel, E. 2011; 30 (1): 13-22
Medicine on a need-to-know basis NATURE IMMUNOLOGY Busch, R., Byrne, B., Gandrud, L., Sears, D., Meyer, E., Kattah, M., Kurihara, C., Haertel, E., Parnes, J. R., Mellins, E. D. 2006; 7 (6): 543-547

Abstract

Disease-oriented, introductory medical curricula can help overcome educational and institutional barriers that separate aspiring translational scientists in PhD programs from the world of medicine.

View details for Web of Science ID 000237751200004

View details for PubMedID 16715061
The effects of content, format, and inquiry level on science performance assessment scores APPLIED MEASUREMENT IN EDUCATION Stecher, B. M., Klein, S. P., Solano-Flores, G., McCaffrey, D., Robyn, A., Shavelson, R. J., Haertel, E. 2000; 13 (2): 139-160

View details for Web of Science ID 000086300300002
Performance assessment and education reform PHI DELTA KAPPAN Haertel, E. H. 1999; 80 (9): 662-666

View details for Web of Science ID 000080073300006
Gender and racial/ethnic differences on performance assessments in science EDUCATIONAL EVALUATION AND POLICY ANALYSIS Klein, S. P., Jovanovic, J., Stecher, B. M., McCaffrey, D., Shavelson, R. J., Haertel, E., Solano-Flores, G., Comfort, K. 1997; 19 (2): 83-97

View details for Web of Science ID A1997XE01200001
Generalizability analysis for performance assessments of student achievement or school effectiveness EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT Cronbach, L. J., Linn, R. L., Brennan, R. L., Haertel, E. H. 1997; 57 (3): 373-399

View details for Web of Science ID A1997WY08700001
Components of Interesting Science Experiments SCIENCE EDUCATION Martinez, M. E., Haertel, E. 1991; 75 (4): 471-479

View details for Web of Science ID A1991FT41100007
I Never Promised You 1st Place - A Rejoinder PHI DELTA KAPPAN Bradburn, N., Haertel, E., Schwille, J., Torney-Purta, J. 1991; 72 (10): 774-777

View details for Web of Science ID A1991FP58200014
Continuous and Discrete Latent Structure Models for Item Response Data PSYCHOMETRIKA Haertel, E. H. 1990; 55 (3): 477-494

View details for Web of Science ID A1990EE53700005
Using Restricted Latent Class Models to Map the Skill Structure of Achievement Items JOURNAL OF EDUCATIONAL MEASUREMENT Haertel, E. H. 1989; 26 (4): 301-321

View details for Web of Science ID A1989DA96600001
Buyers Beware - The Deceptively High Cost of LISREL COUNSELING PSYCHOLOGIST Haertel, E. H., Thoresen, C. E. 1987; 15 (2): 316-319

View details for Web of Science ID A1987H127000009
Measuring School Performance to Improve School Practice EDUCATION AND URBAN SOCIETY Haertel, E. 1986; 18 (3): 312-325

View details for Web of Science ID A1986C633700004
Construct-Validity and Criterion-Referenced Testing REVIEW OF EDUCATIONAL RESEARCH Haertel, E. 1985; 55 (1): 23-46

View details for Web of Science ID A1985AHE9800004
Detection of a Skill Dichotomy Using Standardized Achievement-Test Items JOURNAL OF EDUCATIONAL MEASUREMENT Haertel, E. 1984; 21 (1): 59-72

View details for Web of Science ID A1984SF54200005
An Application of Latent Class Models to Assessment Data APPLIED PSYCHOLOGICAL MEASUREMENT Haertel, E. 1984; 8 (3): 333-346

View details for Web of Science ID A1984TU02600011
School-Achievement - Thinking About What to Test JOURNAL OF EDUCATIONAL MEASUREMENT Haertel, E., Calfee, R. 1983; 20 (2): 119-132

View details for Web of Science ID A1983QS78800003
The Impact of Leisure-Time Television on School Learning - A Research Synthesis AMERICAN EDUCATIONAL RESEARCH JOURNAL Williams, P. A., Haertel, E. H., Haertel, G. D., Walberg, H. J. 1982; 19 (1): 19-50

View details for Web of Science ID A1982ND51000002