Edward Haertel
Jacks Family Professor of Education, Emeritus
Graduate School of Education
Bio
Dr. Haertel is an expert in educational testing and assessment. His research and teaching focus on psychometrics and educational policy, especially test-based accountability and related policy uses of test data. His recent work has examined standard-setting methods; limitations of value-added models for teacher and school accountability; impacts of testing on curriculum, students, and educational policy; test reliability; and generalizability theory.
Administrative Appointments
-
Jacks Family Professor of Education, Emeritus, Stanford Graduate School of Education (2013 - Present)
-
Jacks Family Professor of Education, Stanford Graduate School of Education (2008 - 2012)
-
Associate Dean for Faculty Affairs, Stanford Graduate School of Education (2005 - 2010)
-
Professor of Education, Stanford Graduate School of Education (1992 - 2008)
-
Associate Professor of Education, Stanford Graduate School of Education (1987 - 1992)
-
Assistant Professor of Education, Stanford Graduate School of Education (1980 - 1987)
Boards, Advisory Committees, Professional Organizations
-
Member, Technical Design Group, California Department of Education, Assessment and Accountability Unit (2015 - Present)
-
Member, Smarter Balanced Assessment Consortium Technical Advisory Committee (2019 - Present)
-
Assistant Professor, University of Illinois, Chicago (1979 - 1980)
Professional Education
-
PhD, University of Chicago, Measurement, Evaluation and Statistical Analysis (1980)
-
BA, University of Wisconsin-Madison, Mathematics (1971)
Research Interests
-
Assessment, Testing and Measurement
-
International and Comparative Education
-
School Reform
-
Standards
-
Teachers and Teaching
Current Research and Scholarly Interests
Functions of test scores in discourse about education; how testing shapes ideas of success and failure for students, schools, and public education as a whole.
2023-24 Courses
-
Independent Studies (8)
- Directed Reading: EDUC 480 (Aut, Win, Spr, Sum)
- Directed Reading in Education: EDUC 180 (Aut, Win, Spr, Sum)
- Directed Research: EDUC 490 (Aut, Win, Spr, Sum)
- Directed Research in Education: EDUC 190 (Aut, Win, Spr, Sum)
- Honors Research: EDUC 140 (Aut)
- Master's Thesis: EDUC 185 (Aut, Win, Sum)
- Practicum: EDUC 470 (Aut, Win, Spr, Sum)
- Supervised Internship: EDUC 380 (Aut, Win, Spr, Sum)
All Publications
-
Comparability of Large-Scale Educational Assessments: Issues and Recommendations.
edited by Berman, A. I., Haertel, E. H., Pellegrino, J. W.
National Academy of Education. 2020
View details for DOI 10.31094/2020/1
-
The Testing Charade: Pretending to Make Schools Better (Book Review)
AMERICAN JOURNAL OF EDUCATION
2018; 124 (3): 373–77
View details for Web of Science ID 000430443400005
-
Measuring Cultural Dimensions of Classroom Interactions
EDUCATIONAL ASSESSMENT
2018; 23 (4): 250–76
View details for DOI 10.1080/10627197.2018.1515010
View details for Web of Science ID 000446134500002
-
Tests, Test Scores, and Constructs
EDUCATIONAL PSYCHOLOGIST
2018; 53 (3): 203–16
View details for DOI 10.1080/00461520.2018.1476868
View details for Web of Science ID 000443865300004
-
Fairness using derived scores
Fairness in Educational Assessment and Measurement
Routledge. 2016: 233–254
-
Engaging methodological pluralism
Handbook of research on teaching
2016: 127-247
-
Selection of Common Items as an Unrecognized Source of Variability in Test Equating: A Bootstrap Approximation Assuming Random Sampling of Common Items
APPLIED MEASUREMENT IN EDUCATION
2014; 27 (1): 46-57
View details for DOI 10.1080/08957347.2013.853069
View details for Web of Science ID 000329424600004
-
Getting the Help We Need
JOURNAL OF EDUCATIONAL MEASUREMENT
2013; 50 (1): 84-90
View details for DOI 10.1111/jedm.12002
View details for Web of Science ID 000316286300003
-
Reliability and Validity of Inferences about Teachers Based on Student Scores
William H. Angoff Memorial Lecture Series. Educational Testing Service 2013
-
Improving ability measurement in surveys by following the principles of IRT: The Wordsum vocabulary test in the General Social Survey
SOCIAL SCIENCE RESEARCH
2012; 41 (5): 1003-1016
Abstract
Survey researchers often administer batteries of questions to measure respondents' abilities, but these batteries are not always designed in keeping with the principles of optimal test construction. This paper illustrates one instance in which following these principles can improve a measurement tool used widely in the social and behavioral sciences: the GSS's vocabulary test called "Wordsum". This ten-item test is composed of very difficult items and very easy items, and item response theory (IRT) suggests that the omission of moderately difficult items is likely to have handicapped Wordsum's effectiveness. Analyses of data from national samples of thousands of American adults show that after adding four moderately difficult items to create a 14-item battery, "Wordsumplus" (1) outperformed the original battery in terms of quality indicators suggested by classical test theory; (2) reduced the standard error of IRT ability estimates in the middle of the latent ability dimension; and (3) exhibited higher concurrent validity. These findings show how to improve Wordsum and suggest that analysts should use a score based on all 14 items instead of using the summary score provided by the GSS, which is based on only the original 10 items. These results also show more generally how surveys measuring abilities (and other constructs) can benefit from careful application of insights from the contemporary educational testing literature.
View details for DOI 10.1016/j.ssresearch.2012.05.007
View details for Web of Science ID 000306620600001
View details for PubMedID 23017913
-
Evaluating teacher evaluation
PHI DELTA KAPPAN
2012; 93 (6): 8-15
View details for Web of Science ID 000301306000005
-
The briefing book method
Setting performance standards: Foundations, methods, and innovations
2012: 283-299
-
The Effect of Ignoring Classroom‐Level Variance in Estimating the Generalizability of School Mean Scores
EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE
2011; 30 (1): 13-22
-
Medicine on a need-to-know basis
NATURE IMMUNOLOGY
2006; 7 (6): 543-547
Abstract
Disease-oriented, introductory medical curricula can help overcome educational and institutional barriers that separate aspiring translational scientists in PhD programs from the world of medicine.
View details for Web of Science ID 000237751200004
View details for PubMedID 16715061
-
The effects of content, format, and inquiry level on science performance assessment scores
APPLIED MEASUREMENT IN EDUCATION
2000; 13 (2): 139-160
View details for Web of Science ID 000086300300002
-
Performance assessment and education reform
PHI DELTA KAPPAN
1999; 80 (9): 662-666
View details for Web of Science ID 000080073300006
-
Gender and racial/ethnic differences on performance assessments in science
EDUCATIONAL EVALUATION AND POLICY ANALYSIS
1997; 19 (2): 83-97
View details for Web of Science ID A1997XE01200001
-
Generalizability analysis for performance assessments of student achievement or school effectiveness
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT
1997; 57 (3): 373-399
View details for Web of Science ID A1997WY08700001
-
COMPONENTS OF INTERESTING SCIENCE EXPERIMENTS
SCIENCE EDUCATION
1991; 75 (4): 471-479
View details for Web of Science ID A1991FT41100007
-
I NEVER PROMISED YOU 1ST PLACE - A REJOINDER
PHI DELTA KAPPAN
1991; 72 (10): 774-777
View details for Web of Science ID A1991FP58200014
-
CONTINUOUS AND DISCRETE LATENT STRUCTURE MODELS FOR ITEM RESPONSE DATA
PSYCHOMETRIKA
1990; 55 (3): 477-494
View details for Web of Science ID A1990EE53700005
-
USING RESTRICTED LATENT CLASS MODELS TO MAP THE SKILL STRUCTURE OF ACHIEVEMENT ITEMS
JOURNAL OF EDUCATIONAL MEASUREMENT
1989; 26 (4): 301-321
View details for Web of Science ID A1989DA96600001
-
BUYERS BEWARE - THE DECEPTIVELY HIGH COST OF LISREL
COUNSELING PSYCHOLOGIST
1987; 15 (2): 316-319
View details for Web of Science ID A1987H127000009
-
MEASURING SCHOOL PERFORMANCE TO IMPROVE SCHOOL PRACTICE
EDUCATION AND URBAN SOCIETY
1986; 18 (3): 312-325
View details for Web of Science ID A1986C633700004
-
CONSTRUCT-VALIDITY AND CRITERION-REFERENCED TESTING
REVIEW OF EDUCATIONAL RESEARCH
1985; 55 (1): 23-46
View details for Web of Science ID A1985AHE9800004
-
DETECTION OF A SKILL DICHOTOMY USING STANDARDIZED ACHIEVEMENT-TEST ITEMS
JOURNAL OF EDUCATIONAL MEASUREMENT
1984; 21 (1): 59-72
View details for Web of Science ID A1984SF54200005
-
AN APPLICATION OF LATENT CLASS MODELS TO ASSESSMENT DATA
APPLIED PSYCHOLOGICAL MEASUREMENT
1984; 8 (3): 333-346
View details for Web of Science ID A1984TU02600011
-
SCHOOL-ACHIEVEMENT - THINKING ABOUT WHAT TO TEST
JOURNAL OF EDUCATIONAL MEASUREMENT
1983; 20 (2): 119-132
View details for Web of Science ID A1983QS78800003
-
THE IMPACT OF LEISURE-TIME TELEVISION ON SCHOOL LEARNING - A RESEARCH SYNTHESIS
AMERICAN EDUCATIONAL RESEARCH JOURNAL
1982; 19 (1): 19-50
View details for Web of Science ID A1982ND51000002