Instructor, Emergency Medicine
Developing and authenticating an electronic health record-based report card for assessing residents' clinical performance
AEM EDUCATION AND TRAINING
2023; 7 (2): e10851
The electronic health record (EHR) is frequently identified as a source of assessment data regarding residents' clinical performance. To better understand how to harness EHR data for education purposes, the authors developed and authenticated a prototype resident report card. This report card used EHR data exclusively and was authenticated with various stakeholders to understand individuals' reactions to and interpretations of EHR data when presented in this way.

Using principles derived from participatory action research and participatory evaluation, this study brought together residents, faculty, a program director, and medical education researchers (n = 19) to develop and authenticate a prototype report card for residents. From February to September 2019, participants were invited to take part in a semistructured interview that explored their reactions to the prototype and provided insights about how they interpreted the EHR data.

Our results highlighted three themes: data representation, data value, and data literacy. Participants differed on the best way to present the various EHR metrics and felt pertinent contextual information should be included. All participants agreed that the EHR data presented were valuable, but most had concerns about using them for assessment. Finally, participants had difficulties interpreting the data, suggesting that these data could be presented more intuitively and that residents and faculty may require additional training to fully appreciate these EHR data.

This work demonstrated how EHR data could be used to assess residents' clinical performance, but it also identified areas that warrant further consideration, especially pertaining to data representation and subsequent interpretation. Providing residents and faculty with EHR data in a resident report card was viewed as most valuable when used to guide feedback and coaching conversations.
DOI: 10.1002/aet2.10851
Web of Science ID: 000959757100001
PubMed ID: 37008653
PubMed Central ID: PMC10061574
Supportive and collaborative interdependence: Distinguishing residents' contributions within healthcare teams.
Individual assessments disregard team contributions, while team assessments disregard an individual's contributions. Interdependence has been put forth as a conceptual bridge between our educational traditions of assessing individual performance and our imminent challenge of assessing team-based performance without losing sight of the individual. The purpose of this study was to develop a more refined conceptualization of interdependence to inform the creation of measures that can assess the interdependence of residents within healthcare teams.

Following a constructivist grounded theory approach, we conducted 49 semi-structured interviews with various members of healthcare teams (e.g., physicians, nurses, pharmacists, social workers, patients) across two different clinical specialties - Emergency Medicine and Pediatrics - at two separate sites. Data collection and analysis occurred iteratively. Constant comparative inductive analysis was used, and coding consisted of three stages: initial, focused, and theoretical.

We asked participants to reflect upon interdependence and describe how it exists in their clinical setting. All participants acknowledged the existence of interdependence, but they did not view it as part of a linear spectrum where interdependence becomes independence. Our analysis refined the conceptualization of interdependence to include two types: supportive and collaborative. Supportive interdependence occurs within healthcare teams when one member demonstrates insufficient expertise to perform within their scope of practice. Collaborative interdependence, on the other hand, was not triggered by lack of experience/expertise within an individual's scope of practice, but rather by recognition that patient care requires contributions from other team members.

In order to assess a team's collective performance without losing sight of the individual, we need to capture interdependent performances and characterize the nature of such interdependence. Moving away from a linear trajectory where independence is seen as the end goal can also help support efforts to measure an individual's competence as an interdependent member of a healthcare team.
DOI: 10.1111/medu.15064
PubMed ID: 36822577
Facilitating Residents' Understanding of Electronic Health Record Report Card Data Using Faculty Feedback and Coaching.
Academic Medicine: Journal of the Association of American Medical Colleges
Feedback continues to present a challenge for competency-based medical education (CBME). Clear, consistent, and credible feedback is vital to supporting one's ongoing development, yet it can be difficult to gather clinical performance data about residents. This study sought to determine whether providing residents with electronic health record (EHR) based report cards, as well as an opportunity to discuss these data with faculty trained using the R2C2 model, can help residents understand and interpret their clinical performance metrics.

Using action research methodology, the author team collected EHR data from July 2017 to February 2020 for all residents (n = 21) in one 5-year Emergency Medicine program and created personalized report cards for each resident. During October 6-17, 2020, 8 of 17 eligible residents agreed to have their feedback conversations recorded and to participate in a subsequent interview with a non-physician member of the research team. Data were analyzed using thematic analysis, and the authors used inductive analysis to identify themes in the data.

In analyzing both the feedback conversations and the individual interviews with faculty and residents, the authors identified 2 main themes: (1) reactions and responses to receiving personalized EHR data and (2) the value of EHR data for assessment and feedback purposes. All participants believed that EHR data metrics are useful for prompting self-reflection, and many pointed to their utility in providing suggestions for actionable changes in their clinical practice. For faculty, having a tool through which underperforming residents can be shown "objective" data about their clinical performance helps underscore the need for improvement, particularly when residents are resistant.

The EHR is a valuable source of educational data, and this study demonstrates one of the many thoughtful ways it can be used for assessment and feedback purposes.
DOI: 10.1097/ACM.0000000000004900
PubMed ID: 35947480
Considering the interdependence of clinical performance: implications for assessment and entrustment.
Our ability to assess independent trainee performance is a key element of competency-based medical education (CBME). In workplace-based clinical settings, however, the performance of a trainee can be deeply entangled with others on the team. This presents a fundamental challenge, given the need to assess and entrust trainees based on the evolution of their independent clinical performance. The purpose of this study, therefore, was to understand what faculty members and senior postgraduate trainees believe constitutes independent performance in a variety of clinical specialty contexts.

Following constructivist grounded theory, and using both purposive and theoretical sampling, we conducted individual interviews with 11 clinical teaching faculty members and 10 senior trainees (postgraduate year 4/5) across 12 postgraduate specialties. Constant comparative inductive analysis was conducted. Return of findings was also carried out using one-to-one sessions with key informants and public presentations.

Although some independent performances were described, participants spoke mostly about the exceptions to and disclaimers about these, elaborating their sense of the interdependence of trainee performances. Our analysis of these interdependence patterns identified multiple configurations of coupling, with the dominant being coupling of trainee and supervisor performance. We consider how the concept of coupling could advance workplace-based assessment efforts by supporting models that account for the collective dimensions of clinical performance.

These findings call into question the assumption of independent performance and offer an important step toward measuring coupled performance. An understanding of coupling can help both to better distinguish independent and interdependent performances and to consider revising workplace-based assessment approaches for CBME.
DOI: 10.1111/medu.13588
PubMed ID: 29676054
- Using electronic health record data to assess residents’ performance in the clinical workplace: The good, the bad, and the unthinkable Academic Medicine 2018
The fundamentals of artificial intelligence in medical education research: AMEE Guide No. 156.
The use of Artificial Intelligence (AI) in medical education has the potential to facilitate complicated tasks and improve efficiency. For example, AI could help automate the assessment of written responses or provide feedback on medical image interpretations with excellent reliability. While applications of AI in learning, instruction, and assessment are growing, further exploration is still required, and few conceptual or methodological guides exist for medical educators wishing to evaluate or engage in AI research. In this guide, we aim to: 1) describe practical considerations involved in reading and conducting studies in medical education using AI, 2) define basic terminology, and 3) identify which medical education problems and data are ideally suited for using AI.
DOI: 10.1080/0142159X.2023.2180340
PubMed ID: 36862064
Using Resident-Sensitive Quality Measures Derived From Electronic Health Record Data to Assess Residents' Performance in Pediatric Emergency Medicine
2023; 98 (3): 367-375
Traditional quality metrics do not adequately represent the clinical work done by residents and, thus, cannot be used to link residency training to health care quality. This study aimed to determine whether electronic health record (EHR) data can be used to meaningfully assess residents' clinical performance in pediatric emergency medicine using resident-sensitive quality measures (RSQMs).

EHR data for asthma and bronchiolitis RSQMs from Cincinnati Children's Hospital Medical Center, a quaternary children's hospital, between July 1, 2017, and June 30, 2019, were analyzed by ranking residents based on composite scores calculated using raw, unadjusted, and case-mix-adjusted latent score models, with lower percentiles indicating lower quality of care and performance. Reliability and associations between the scores produced by the 3 scoring models were compared. Resident and patient characteristics associated with performance in the highest and lowest tertiles, as well as changes in residents' rank after case-mix adjustment, were also identified.

274 residents and 1,891 individual encounters of bronchiolitis patients aged 0-1, as well as 270 residents and 1,752 individual encounters of asthmatic patients aged 2-21, were included in the analysis. The minimum reliability requirement to create a composite score was met for the asthma data (α = 0.77) but not for bronchiolitis (α = 0.17). The asthma composite scores showed high correlations (r = 0.90-0.99) between the raw, latent, and adjusted composite scores. After case-mix adjustment, residents' absolute percentile rank shifted on average 10 percentiles. Residents who dropped by 10 or more percentiles were likely to be more junior, to have seen fewer patients, to have cared for less acute and younger patients, or to have had patients with a longer emergency department stay.

For some clinical areas, it is possible to use EHR data, adjusted for patient complexity, to meaningfully assess residents' clinical performance and identify opportunities for quality improvement.
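The reliability coefficient reported above (α) is conventionally Cronbach's alpha computed over a resident-by-measure score matrix. As a rough illustration of how such a coefficient is calculated (the matrix below is invented for demonstration and is not the study's data):

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-resident rows of k measure scores."""
    k = len(scores[0])
    items = list(zip(*scores))                       # one column per measure
    item_var = sum(variance(col) for col in items)   # sum of per-measure variances
    total_var = variance([sum(row) for row in scores])  # variance of composite totals
    return (k / (k - 1)) * (1 - item_var / total_var)

# Illustrative matrix: 4 residents x 3 binary quality measures (invented data)
alpha = cronbach_alpha([[1, 1, 1], [1, 0, 1], [0, 0, 0], [1, 1, 0]])
```

Alpha rises as the individual measures covary, which is why a composite was defensible for the asthma measures (α = 0.77) but not for bronchiolitis (α = 0.17).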
DOI: 10.1097/ACM.0000000000005084
Web of Science ID: 000971604400022
PubMed ID: 36351056
PubMed Central ID: PMC9944759
The Inconspicuous Learner Handover: An Exploratory Study of U.S. Emergency Medicine Program Directors' Perceptions of Learner Handovers from Medical School to Residency.
Teaching and learning in medicine
Phenomenon: Central to competency-based medical education is the need for a seamless developmental continuum of training and practice. Trainees currently experience significant discontinuity in the transition from undergraduate (UME) to graduate medical education (GME). The learner handover is intended to smooth this transition, but little is known about how well this is working from the GME perspective. In an attempt to gather preliminary evidence, this study explores U.S. program directors' (PDs) perspectives on the learner handover from UME to GME.

Approach: Using exploratory qualitative methodology, we conducted semi-structured interviews with 12 Emergency Medicine PDs within the U.S. from October to November 2020. We asked participants to describe their current perception of the learner handover from UME to GME. We then performed thematic analysis using an inductive approach.

Findings: We identified two main themes: the inconspicuous learner handover and barriers to creating a successful UME-to-GME learner handover. PDs described the current state of the learner handover as "nonexistent," yet acknowledged that information is transmitted from UME to GME. Participants also highlighted key challenges preventing a successful learner handover from UME to GME, including conflicting expectations, issues of trust and transparency, and a dearth of assessment data to actually hand over.

Insights: PDs highlight the inconspicuous nature of learner handovers, suggesting that assessment information is not shared the way it should be in the transition from UME to GME. Challenges with the learner handover demonstrate a lack of trust, transparency, and explicit communication between UME and GME. Our findings can inform how national organizations establish a unified approach to transmitting growth-oriented assessment data and formalize transparent learner handovers from UME to GME.
DOI: 10.1080/10401334.2023.2178438
PubMed ID: 36794363
Filling the Core EPA 10 assessment void: A framework for individual assessment of Core Entrustable Professional Activity 10 competencies in medical students.
AEM education and training
2022; 6 (6): e10787
Objectives: The goal of this study was to develop and evaluate a novel curriculum and assessment tool for Core Entrustable Professional Activity (EPA) 10 competencies and entrustment scoring in a cohort of medical students in their emergency medicine (EM) clerkship, using a framework of individualized, ad hoc, formative assessment. Core EPA 10 is an observable workplace-based activity for graduating medical students: recognize a patient requiring urgent or emergent care and initiate evaluation and management.

Methods: This was a prospective, pretest-posttest study of medical students during their EM clerkship. Using the Thomas and Kern framework, we created a curriculum of simulation cases about chest pain/cardiac arrest and respiratory distress, which included novel assessment checklists, and instructional videos about recognizing and managing emergencies. Students were individually pretested on EPA 10 competencies using the simulation cases. Two raters scored students using standardized checklists. Students then watched instructional videos, underwent a posttest with the simulation cases, and were scored again by the two raters using the checklists. Differences between pretest and posttest scores were analyzed using paired t-tests and Wilcoxon signed-rank tests.

Results: Seventy-three out of 85 (86%) students completed the curriculum. Mean scores from pretest to final posttest significantly improved, from 14.8/19 (SD 1.91) to 17.1/19 (SD 1.00), t(68) = 10.56, p < 0.001, for the chest pain/cardiac arrest case and from 8.5/13 (SD 1.79) to 11.1/13 (SD 0.89), t(67) = 11.15, p < 0.001, for the respiratory distress case. The kappa coefficients were 0.909 (n = 2,698, p < 0.001) and 0.931 (n = 1,872, p < 0.001). Median modified Chen entrustment scores improved from 1b (i.e., "Watch me do this") to 2b (i.e., "I'll watch you") for the chest pain/cardiac arrest case (p < 0.001) and from 1b/2a (i.e., "Watch me do this"/"Let's do this together") to 3a (i.e., "You go ahead, and I'll double-check all of your findings") for the respiratory distress case (p < 0.001).

Conclusion: A new directed curriculum of standardized simulation cases and asynchronous instructional videos improved medical student performance in EPA 10 competencies and entrustment scores. This study provides a curricular framework to support formative individualized assessments for EPA 10.
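The paired t statistics reported above (e.g., t(68) = 10.56) are the mean of the per-student pretest-to-posttest differences divided by the standard error of that mean. A minimal standard-library sketch, using invented checklist scores rather than the study's data:

```python
import math
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired t statistic and degrees of freedom for matched pre/post scores."""
    diffs = [b - a for a, b in zip(pre, post)]   # per-student improvement
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Invented pretest/posttest checklist scores for 8 students (not the study's data)
t, df = paired_t([14, 15, 13, 16, 14, 15, 17, 14],
                 [17, 17, 16, 18, 16, 17, 18, 16])
```

With df = 7, any |t| above the two-tailed 0.05 critical value of about 2.365 would be significant; the Wilcoxon signed-rank test serves as the nonparametric counterpart when the differences are not assumed normal.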
DOI: 10.1002/aet2.10787
PubMed ID: 36389650
"Faces on a screen": A qualitative study of the virtual and in-person conference experience.
AEM education and training
2022; 6 (6): e10827
The strengths and weaknesses of virtual and in-person formats within continuing professional development (CPD) are incompletely understood. This study sought to explore attendees' perspectives across multiple specialties regarding benefits and limitations of conference formats and strategies for successful virtual and hybrid (i.e., in-person conferences with a virtual option) conferences.

From December 2020 to January 2021, semistructured interviews were conducted with participants who attended both virtual and in-person CPD conferences. Purposive sampling was utilized to ensure diverse representation of gender, years in practice, location, academic rank, specialty, and practice type. Multiple specialties were intentionally sought to better understand the broader experience among physicians in general, rather than among a specific specialty. Using a modified grounded theory approach with a constructivist-interpretivist paradigm, two investigators independently analyzed all interview transcripts. Discrepancies were resolved by in-depth discussion and negotiated consensus.

Twenty-six individuals across 16 different specialties were interviewed. We identified three overarching concepts: motivations to attend conferences, benefits and limitations of different conference formats, and strategies to optimize virtual and hybrid conferences. Specific motivators included both professional and personal factors. Benefits of the in-person format included networking/community, immersion, and wellness, while the major limitation was integration with personal life. Benefits of the virtual format were flexibility, accessibility, and incorporation of technology, while limitations included technical challenges, distractions, limitations for tactile learning, and communication/connection. Benefits of the hybrid format included more options for access, while limitations included challenges with synchrony of formats and dilution of experiences. Strategies to improve virtual/hybrid conferences included optimizing technology/production, facilitating networking and engagement, and deliberate selection of content.

This study identified several benefits and limitations of each medium, as well as strategies to optimize virtual and hybrid CPD conferences. This may help inform future CPD conference planning for attendees and conference planners alike.
DOI: 10.1002/aet2.10827
PubMed ID: 36562023
PubMed Central ID: PMC9763964
The next generation of researchers: One-year outcome data from the SAEM Advanced Research Methodology Evaluation and Design in Medical Education (ARMED MedEd) program
AEM EDUCATION AND TRAINING
2022; 6 (6)
DOI: 10.1002/aet2.10818
Web of Science ID: 000887953700001
A lasting impact? Exploring the immediate and longitudinal impact of an emergency department service learning help desk program
AEM EDUCATION AND TRAINING
2022; 6 (3)
DOI: 10.1002/aet2.10760
Web of Science ID: 000808017100001
Statistical points and pitfalls: growth modeling
PERSPECTIVES ON MEDICAL EDUCATION
2022; 11 (2): 104-107
DOI: 10.1007/s40037-022-00703-1
Web of Science ID: 000769840500001
PubMed ID: 35294733
Who's on your team? Specialty identity and inter-physician conflict during admissions
2022; 56 (6): 625-633
Despite the implementation of professionalism curricula and standardised communication tools, inter-physician conflict persists. In particular, the interface between emergency medicine (EM) and internal medicine (IM) has long been recognised as a source of conflict. The social nuances of this conflict remain underexplored, limiting educators' ability to comprehensively address these issues in the clinical learning environment. Thus, the authors explored EM and IM physicians' experiences with negotiating hospital admissions to better understand the social dynamics that contribute to inter-physician conflict and provide foundational guidance for communication best practices.

Using a constructivist grounded theory (CGT) approach, the authors conducted 18 semi-structured interviews between June and October 2020 with EM and IM physicians involved in conversations regarding admissions (CRAs). They asked participants to describe the social exchanges that influenced these conversations and to reflect on their experiences with inter-physician conflict. Data collection and analysis occurred iteratively. The relationships between the codes were discussed by the research team with the goal of developing conceptual connections between the emergent themes.

Participants described how their approaches to CRAs were shaped by their specialty identity, and how allegiance to members of their group contributed to interpersonal conflict. This conflict was further promoted by a mutual sense of disempowerment within the organisation, misaligned expectations, and a desire to promote their group's prerogatives. Conflict was mitigated when patient care experiences fostered cross-specialty team formation and collaboration that dissolved traditional group boundaries.

Conflict between EM and IM physicians during CRAs was primed by participants' specialty identities, their power struggles within the broader organisation, and their sense of duty to their own specialty. However, formation of collaborative inter-specialty physician teams and expansion of identity to include colleagues from other specialties can mitigate inter-physician conflict.
DOI: 10.1111/medu.14715
Web of Science ID: 000743339000001
PubMed ID: 34942027
Assessment of Entrustable Professional Activities Using a Web-Based Simulation Platform During Transition to Emergency Medicine Residency: Mixed Methods Pilot Study.
JMIR medical education
2021; 7 (4): e32356
BACKGROUND: The 13 core entrustable professional activities (EPAs) are key competency-based learning outcomes in the transition from undergraduate to graduate medical education in the United States. Five of these EPAs (EPA2: prioritizing differentials, EPA3: recommending and interpreting tests, EPA4: entering orders and prescriptions, EPA5: documenting clinical encounters, and EPA10: recognizing urgent and emergent conditions) are uniquely suited for web-based assessment.

OBJECTIVE: In this pilot study, we created cases on a web-based simulation platform for the diagnostic assessment of these EPAs and examined the feasibility and acceptability of the platform.

METHODS: Four simulation cases underwent 3 rounds of consensus panels and pilot testing. Incoming emergency medicine interns (N=15) completed all cases. A maximum of 4 "look for" statements, which encompassed specific EPAs, were generated for each participant: (1) performing harmful or missing actions, (2) narrowing differential or wrong final diagnosis, (3) errors in documentation, and (4) lack of recognition and stabilization of urgent diagnoses. Finally, we interviewed a sample of interns (n=5) and residency leadership (n=5) and analyzed the responses using thematic analysis.

RESULTS: All participants had at least one missing critical action, and 40% (6/15) of the participants performed at least one harmful action across all 4 cases. The final diagnosis was not included in the differential diagnosis in more than half of the assessments (8/15, 54%). Other errors included selecting incorrect documentation passages (6/15, 40%) and indiscriminately applying oxygen (9/15, 60%). The interview themes included psychological safety of the interface, ability to assess learning, and fidelity of cases. The most valuable feature cited was the ability to place orders in a realistic electronic medical record interface.

CONCLUSIONS: This study demonstrates the feasibility and acceptability of a web-based platform for diagnostic assessment of specific EPAs. The approach rapidly identifies potential areas of concern for incoming interns using an asynchronous format, provides feedback in a manner appreciated by residency leadership, and informs individualized learning plans.
DOI: 10.2196/32356
PubMed ID: 34787582
Defining and Adopting Clinical Performance Measures in Graduate Medical Education: Where Are We Now and Where Are We Going?
Academic Medicine: Journal of the Association of American Medical Colleges
Assessment and evaluation of trainees' clinical performance are needed to ensure safe, high-quality patient care. Clinical performance measures also aid in the development of reflective, high-performing clinicians and hold graduate medical education (GME) accountable to the public. While clinical performance measures hold great potential, challenges of defining, extracting, and measuring clinical performance in this way hinder their use for educational and quality improvement purposes. This article provides a way forward by identifying and articulating how clinical performance measures can be used to enhance GME by linking educational objectives with relevant clinical outcomes. The authors explore four key challenges: defining clinical performance measures, measuring them, using electronic health record and clinical registry data to capture clinical performance, and bridging the silos of medical education and health care quality improvement. The authors also propose solutions to showcase the value of clinical performance measures and conclude with a research and implementation agenda. The authors argue for developing a common taxonomy of uniform specialty-specific clinical performance measures, linking these measures to large-scale GME databases, and applying both quantitative and qualitative methods to create a rich understanding of how GME affects quality of care and patient outcomes. The focus of this article is primarily GME, yet similar challenges and solutions will be applicable to other areas of medical and health professions education as well.
DOI: 10.1097/ACM.0000000000002620
PubMed ID: 30720528
The Birth of a Return to Work Policy for New Resident Parents in Emergency Medicine.
Academic Emergency Medicine: Official Journal of the Society for Academic Emergency Medicine
OBJECTIVE: With the rising number of female physicians, more children than ever will be born during residency, and the current system is inadequate to handle this increase in new resident parents. Residency is stressful and rigorous on its own, let alone when pregnant or with a new child. Policies that ease these stressful transitions are generally either insufficient or nonexistent. Therefore, we created a comprehensive Return to Work Policy for resident parents and piloted its implementation. Our policy aims to: 1) establish a clear, shared understanding of the regulatory and training requirements as they pertain to parental leave, 2) facilitate a smooth transition for new parents returning to work, and 3) summarize the local and institutional resources available to both men and women during residency training.

METHOD: In Fall 2017, a task force was convened to draft a Return to Work Policy for New Resident Parents. The task force included 9 key stakeholders (i.e., residents, faculty, and administration) at our institution and was made up of 3 Graduate Medical Education (GME) Program Directors, a Vice Chair of Education, a Designated Institutional Official (DIO), a Chief Resident, and 3 members of our academic department's Faculty Affairs Committee. The task force was selected for its members' expertise in gender equity issues, mentorship of resident parents, GME, and departmental administration.

RESULTS: After development, the policy was piloted from November 2017 to June 2018. Our pilot implementation period included 7 new resident parents. All of these residents received schedules that met the return to work scheduling terms of our Return to Work Policy, including no overnight shifts, no sick call, and no more than 3 shifts in a row. Of equal importance, throughout our pilot, the emergency department schedules at all of our clinical sites remained fully staffed and our sick call pool was unaffected.

CONCLUSION: Our Return to Work Policy for New Resident Parents provides a comprehensive guide to training requirements and family leave policies, an overview of available resources, and a scheduling framework that makes for a smooth transition back to clinical duties.
PubMed ID: 30636353
- Examining Differential Rater Functioning using a Between-Subgroup Outfit Approach Journal of Educational Measurement 2019
- You want me to assess what? Faculty perceptions of assessing residents from outside their specialty Academic Medicine 2019
A Call to Investigate the Relationship Between Education and Health Outcomes Using Big Data
2018; 93 (6): 829–32
There exists an assumption that improving medical education will improve patient care. While seemingly logical, this premise has rarely been investigated. In this Invited Commentary, the authors propose the use of big data to test this assumption. The authors present a few example research studies linking education and patient care outcomes and argue that using big data may more easily facilitate the process needed to investigate this assumption. The authors also propose that collaboration is needed to link educational and health care data. They then introduce a grassroots initiative, inclusive of universities in one Canadian province and national licensing organizations, that is working to collect, organize, link, and analyze big data to study the relationship between pedagogical approaches to medical training and patient care outcomes. While the authors acknowledge the possible challenges and issues associated with harnessing big data, they believe that the benefits outweigh them. There is a need for medical education research to go beyond the outcomes of training to study practice and clinical outcomes as well. Without a coordinated effort to harness big data, policy makers, regulators, medical educators, and researchers are left with sometimes costly guesses and assumptions about what works and what does not. As the social, time, and financial investments in medical education continue to increase, it is imperative to understand the relationship between education and health outcomes.
DOI: 10.1097/ACM.0000000000002217
Web of Science ID: 000435369500022
PubMed ID: 29538109
Learning Analytics in Medical Education Assessment: The Past, the Present, and the Future.
AEM education and training
2018; 2 (2): 178-187
With the implementation of competency-based medical education (CBME) in emergency medicine, residency programs will amass substantial amounts of qualitative and quantitative data about trainees' performances. This increased volume of data will challenge traditional processes for assessing trainees and remediating training deficiencies. At the intersection of trainee performance data and statistical modeling lies the field of medical learning analytics. At a local training program level, learning analytics has the potential to assist program directors and competency committees with interpreting assessment data to inform decision making. On a broader level, learning analytics can be used to explore system questions and identify problems that may impact our educational programs. Scholars outside of health professions education have been exploring the use of learning analytics for years and their theories and applications have the potential to inform our implementation of CBME. The purpose of this review is to characterize the methodologies of learning analytics and explore their potential to guide new forms of assessment within medical education.
View details for DOI 10.1002/aet2.10087
View details for PubMedID 30051086
View details for PubMedCentralID PMC6001721
Using electronic health record data to assess emergency medicine trainees' independent and interdependent performance: a qualitative perspective on measuring what matters.
Canadian Journal of Emergency Medicine
View details for DOI 10.1017/cem.2018.336
- Reliability and validity evidence for the quality of assessment for learning (QuAL) score Academic Emergency Medicine 2018
Quality Evaluation Scores are no more Reliable than Gestalt in Evaluating the Quality of Emergency Medicine Blogs: A METRIQ Study
TEACHING AND LEARNING IN MEDICINE
2018; 30 (3): 294–302
Construct: We investigated the quality of emergency medicine (EM) blogs as educational resources.

Online medical education resources such as blogs are increasingly used by EM trainees and clinicians. However, quality evaluations of these resources using gestalt are unreliable. We investigated the reliability of two previously derived quality evaluation instruments for blogs.

Sixty English-language EM websites that published clinically oriented blog posts between January 1 and February 24, 2016, were identified. A random number generator selected 10 websites, and the 2 most recent clinically oriented blog posts from each site were evaluated using gestalt, the Academic Life in Emergency Medicine (ALiEM) Approved Instructional Resources (AIR) score, and the Medical Education Translational Resources: Impact and Quality (METRIQ-8) score, by a sample of medical students, EM residents, and EM attendings. Each rater evaluated all 20 blog posts with gestalt and 15 of the 20 blog posts with the ALiEM AIR and METRIQ-8 scores. Pearson's correlations were calculated between the average scores for each metric. Single-measure intraclass correlation coefficients (ICCs) evaluated the reliability of each instrument.

Our study included 121 medical students, 88 EM residents, and 100 EM attendings who completed ratings. The average gestalt rating of each blog post correlated strongly with the average scores for ALiEM AIR (r = .94) and METRIQ-8 (r = .91). Single-measure ICCs were fair for gestalt (0.37, IQR 0.25-0.56), ALiEM AIR (0.41, IQR 0.29-0.60) and METRIQ-8 (0.40, IQR 0.28-0.59).

The average scores of each blog post correlated strongly with gestalt ratings. However, neither ALiEM AIR nor METRIQ-8 showed higher reliability than gestalt. Improved reliability may be possible through rater training and instrument refinement.
View details for DOI 10.1080/10401334.2017.1414609
View details for Web of Science ID 000435016500007
View details for PubMedID 29381099
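The single-measure ICCs above quantify how well one rater's score predicts another's. As an illustrative sketch only (not the study's analysis, which may have used a different ICC variant), a one-way single-measure ICC can be computed from ANOVA mean squares:

```python
def icc_one_way(ratings):
    """Single-measure one-way ICC (ICC(1,1)) for n targets each rated by k raters.

    ratings: list of lists, shape n x k (one row per blog post, one column per rater).
    """
    n = len(ratings)
    k = len(ratings[0])
    target_means = [sum(row) / k for row in ratings]
    grand_mean = sum(target_means) / n
    # between-target and within-target mean squares from a one-way ANOVA
    msb = k * sum((m - grand_mean) ** 2 for m in target_means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(ratings, target_means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

With the fair values reported above (0.37-0.41), a single rater's gestalt or instrument score is a noisy estimate of a post's quality, which is why averaging across raters matters.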
Mixed Messages or Miscommunication? Investigating the Relationship Between Assessors' Workplace-Based Assessment Scores and Written Comments
2017; 92 (12): 1774–79
The shift toward broader, programmatic assessment has revolutionized the approaches that many take in assessing medical competence. To understand the association between quantitative and qualitative evaluations, the authors explored the relationships that exist among assessors' checklist scores, task ratings, global ratings, and written comments.

The authors collected and analyzed, using regression analyses, data from the McMaster Modular Assessment Program. The data were from emergency medicine residents in their first or second year of postgraduate training from 2012 through 2014. Additionally, using content analysis, the authors analyzed narrative comments corresponding to the "done" and "done, but needs attention" checklist score options.

The regression analyses revealed that the task ratings, provided by faculty assessors, are associated with the use of the "done, but needs attention" checklist score option. Analyses also identified that the "done, but needs attention" option is associated with a narrative comment that is balanced, providing both strengths and areas for improvement. Analysis of qualitative comments revealed differences in the type of comments provided to higher- and lower-performing residents.

This study highlights some of the relationships that exist among checklist scores, rating scales, and written comments. The findings highlight that task ratings are associated with checklist options while global ratings are not. Furthermore, analysis of written comments supports the notion of a "hidden code" used to communicate assessors' evaluation of medical competence, especially when communicating areas for improvement or concern. This study has implications for how individuals should interpret information obtained from qualitative assessments.
View details for DOI 10.1097/ACM.0000000000001743
View details for Web of Science ID 000419151600038
View details for PubMedID 28562452
Comparison of Simulation-based Resuscitation Performance Assessments With In-training Evaluation Reports in Emergency Medicine Residents: A Canadian Multicenter Study.
AEM education and training
2017; 1 (4): 293-300
Simulation stands to serve an important role in modern competency-based programs of assessment in postgraduate medical education. Our objective was to compare the performance of individual emergency medicine (EM) residents in a simulation-based resuscitation objective structured clinical examination (OSCE) using the Queen's Simulation Assessment Tool (QSAT), with portfolio assessment of clinical encounters using a modified in-training evaluation report (ITER), to understand in greater detail the inferences that may be drawn from a simulation-based OSCE assessment.

A prospective observational study was employed to explore the use of a multicenter simulation-based OSCE for evaluation of resuscitation competence. EM residents from five Canadian academic sites participated in the OSCE. Video-recorded performances were scored by blinded raters using the scenario-specific QSATs with domain-specific anchored scores (primary assessment, diagnostic actions, therapeutic actions, communication) and a global assessment score (GAS). Residents' portfolios were evaluated using a modified ITER subdivided by CanMEDS roles (medical expert, communicator, collaborator, leader, health advocate, scholar, and professional) and a GAS. Correlational and regression analyses were performed comparing components of each of the assessment methods.

Portfolio review and ITER scoring were performed for 79 residents participating in the simulation-based OSCE. There was a significant positive correlation between total OSCE and ITER scores (r = 0.341). The strongest correlations were found between ITER medical expert score and each of the OSCE GAS (r = 0.420), communication (r = 0.443), and therapeutic action (r = 0.484) domains. ITER medical expert was a significant predictor of OSCE total (p = 0.002). OSCE therapeutic action was a significant predictor of ITER total (p = 0.02).

Simulation-based resuscitation OSCEs and portfolio assessment captured by ITERs appear to measure differing aspects of competence, with weak to moderate correlation between those measures of conceptually similar constructs. In a program of competency-based assessment of EM residents, a simulation-based OSCE using the QSAT shows promise as a tool for assessing medical expert and communicator roles.
View details for DOI 10.1002/aet2.10055
View details for PubMedID 30051047
View details for PubMedCentralID PMC6001706
Individual Gestalt Is Unreliable for the Evaluation of Quality in Medical Education Blogs: A METRIQ Study
ANNALS OF EMERGENCY MEDICINE
2017; 70 (3): 394–401
Open educational resources such as blogs are increasingly used for medical education. Gestalt is generally the evaluation method used for these resources; however, little information has been published on it. We aim to evaluate the reliability of gestalt in the assessment of emergency medicine blogs.

We identified 60 English-language emergency medicine Web sites that posted clinically oriented blogs between January 1, 2016, and February 24, 2016. Ten Web sites were selected with a random-number generator. Medical students, emergency medicine residents, and emergency medicine attending physicians evaluated the 2 most recent clinical blog posts from each site for quality, using a 7-point Likert scale. The mean gestalt scores of each blog post were compared between groups with Pearson's correlations. Single and average measure intraclass correlation coefficients were calculated within groups. A generalizability study evaluated variance within gestalt and a decision study calculated the number of raters required to reliably (>0.8) estimate quality.

One hundred twenty-one medical students, 88 residents, and 100 attending physicians (93.6% of enrolled participants) evaluated all 20 blog posts. Single-measure intraclass correlation coefficients within groups were fair to poor (0.36 to 0.40). Average-measure intraclass correlation coefficients were more reliable (0.811 to 0.840). Mean gestalt ratings by attending physicians correlated strongly with those by medical students (r=0.92) and residents (r=0.99). The generalizability coefficient was 0.91 for the complete data set. The decision study found that 42 gestalt ratings were required to reliably evaluate quality (>0.8).

The mean gestalt quality ratings of blog posts between medical students, residents, and attending physicians correlate strongly, but individual ratings are unreliable. With sufficient raters, mean gestalt ratings provide a community standard for assessment.
View details for DOI 10.1016/j.annemergmed.2016.12.025
View details for Web of Science ID 000410255300022
View details for PubMedID 28262317
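The decision study quoted above asks how many raters' scores must be averaged before the mean rating is dependable. A minimal sketch of that calculation, using hypothetical variance components (chosen here only so the illustration also lands on a panel of 42; they are not the study's estimates):

```python
def phi_coefficient(var_target, var_rater, var_residual, n_raters):
    # Dependability (Phi) of the mean of n_raters ratings: averaging shrinks
    # the rater and residual error variance by a factor of n_raters.
    error = (var_rater + var_residual) / n_raters
    return var_target / (var_target + error)

def raters_needed(var_target, var_rater, var_residual, target=0.8):
    # Smallest panel size whose mean rating reaches the target dependability.
    n = 1
    while phi_coefficient(var_target, var_rater, var_residual, n) < target:
        n += 1
    return n

# Hypothetical components (NOT the study's estimates): blog-post variance 1.0,
# rater variance 3.0, residual variance 7.5 -> 42 raters reach Phi = 0.8.
```

The general point survives the made-up numbers: when rater and residual variance dwarf true target variance, dependable quality estimates require large rater panels.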
Hawks, Doves and Rasch decisions: Understanding the influence of different cycles of an OSCE on students' scores using Many Facet Rasch Modeling
2017; 39 (1): 92–99
OSCEs are commonly conducted in multiple cycles (different circuits, times, and locations), yet the potential for students' allocation to different OSCE cycles is rarely considered as a source of variance, perhaps in part because conventional psychometrics provide limited insight.

We used Many Facet Rasch Modeling (MFRM) to estimate the influence of "examiner cohorts" (the combined influence of the examiners in the cycle to which each student was allocated) on students' scores within a fully nested multi-cycle OSCE.

Observed average scores for examiner cycles varied by 8.6%, but model-adjusted estimates showed a smaller range of 4.4%. Most students' scores were only slightly altered by the model; the greatest score increase was 5.3%, and the greatest score decrease was -3.6%, with 2 students passing who would have failed.

Despite using 16 examiners per cycle, examiner variability did not completely counter-balance, resulting in an influence of OSCE cycles on students' scores. Assumptions were required for the MFRM analysis; innovative procedures to overcome these limitations and strengthen OSCEs are discussed.

OSCE cycle allocation has the potential to exert a small but unfair influence on students' OSCE scores; these little-considered influences should challenge our assumptions and design of OSCEs.
View details for DOI 10.1080/0142159X.2017.1248916
View details for Web of Science ID 000393885800015
View details for PubMedID 27897083
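MFRM treats examiner behaviour as an additive facet on the logit scale. A toy dichotomous sketch of that idea (real facets analyses use polytomous rating-scale models and estimate all parameters from data; the parameter names here are illustrative):

```python
import math

def rasch_success_probability(ability, station_difficulty, examiner_severity):
    # Dichotomous many-facet Rasch model: each facet shifts the logit additively,
    # so a severe ("hawkish") examiner lowers every student's success probability.
    logit = ability - station_difficulty - examiner_severity
    return 1.0 / (1.0 + math.exp(-logit))
```

A student allocated to a hawk (severity +0.5 logits) has a lower modeled chance of success than the same student facing a dove (severity -0.5), which is exactly the cycle-allocation unfairness the model adjusts for.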
- “It’s Complicated”: Understanding the Relationships Between Checklists, Rating Scales, and Written Comments in Workplace-Based Assessments Academic Medicine 2016; 91 (11)
- Competency-based simulation assessment of resuscitation skills in emergency medicine postgraduate trainees – A Canadian multi-centered study Canadian Medical Education Journal 2016; 7 (1)
Seeing Things Differently or Seeing Different Things? Exploring Raters' Associations of Noncognitive Attributes
2015; 90 (11): S50–S55
Raters represent a significant source of unexplained, and often undesired, variance in performance-based assessments. To better understand rater variance, this study investigated how various raters, observing the same performance, perceived relationships amongst different noncognitive attributes measured in performance assessments.

Medical admissions data from a Multiple Mini-Interview (MMI) used at one Canadian medical school were collected and subsequently analyzed using the Many Facet Rasch Model (MFRM) and hierarchical clustering. This particular MMI consisted of eight stations. At each station a faculty member and an upper-year medical student rated applicants on various noncognitive attributes including communication, critical thinking, effectiveness, empathy, integrity, maturity, professionalism, and resolution.

The Rasch analyses revealed differences between faculty and student raters across the eight different MMI stations. These analyses also identified that, at times, raters were unable to distinguish between the various noncognitive attributes. Hierarchical clustering highlighted differences in how faculty and student raters observed the various noncognitive attributes. Differences in how individual raters associated the various attributes within a station were also observed.

The MFRM and hierarchical clustering helped to explain some of the variability associated with raters in a way that other measurement models are unable to capture. These findings highlight that differences in ratings may result from raters possessing different interpretations of an observed performance. This study has implications for developing more purposeful rater selection and rater profiling in performance-based assessments.
View details for DOI 10.1097/ACM.0000000000000902
View details for Web of Science ID 000375840200008
View details for PubMedID 26505102
Examiners and content and site: Oh My! A national organization's investigation of score variation in large-scale performance assessments
ADVANCES IN HEALTH SCIENCES EDUCATION
2015; 20 (3): 581–94
Examiner effects and content specificity are two well known sources of construct irrelevant variance that present great challenges in performance-based assessments. National medical organizations that are responsible for large-scale performance based assessments experience an additional challenge as they are responsible for administering qualification examinations to physician candidates at several locations and institutions. This study explores the impact of site location as a source of score variation in a large-scale national assessment used to measure the readiness of internationally educated physician candidates for residency programs. Data from the Medical Council of Canada's National Assessment Collaboration were analyzed using Hierarchical Linear Modeling and Rasch Analyses. Consistent with previous research, problematic variance due to examiner effects and content specificity was found. Additionally, site location was also identified as a potential source of construct irrelevant variance in examination scores.
View details for DOI 10.1007/s10459-014-9547-z
View details for Web of Science ID 000357644900002
View details for PubMedID 25164266
Cross-national trends in perceived school pressure by gender and age from 1994 to 2010
EUROPEAN JOURNAL OF PUBLIC HEALTH
2015; 25: 51–56
Pressure within school can be a critical component in understanding how the school experience influences young people's intellectual development, physical and mental health, and future educational decisions.

Data from five survey rounds (1993/1994, 1997/1998, 2001/2002, 2005/2006 and 2009/2010) were used to examine time-, age- and gender-related trends in the amounts of reported school pressure among 11-, 13- and 15-year-olds, in five different regions (North America, Great Britain, Eastern Europe, Nordic and Germanic countries).

Across the regions the reported perceptions of school pressure did not change between 1994 and 2010, despite a temporary increase in 2002 and 2006. With the exception of children at 11 years of age, girls reported higher levels of school pressure than boys (Cohen's d from 0.12 to 0.58) and school pressure was higher in older age groups. These findings were consistent across countries. Regionally, children in North America reported the highest levels of school pressure, and students in the Germanic countries the lowest.

Factors associated with child development and differences in societal expectations and structures, along with the possible, albeit differential, impact of the Programme for International Student Assessment (PISA), may partially explain the differences and trends found in school pressure. School pressure increases alongside the onset of adolescence and the shift from elementary school to the more demanding expectations of secondary education. Time-related increases in school pressure occurred in the years following the release of the PISA results, and were larger in those regions in which results were less positive.
View details for DOI 10.1093/eurpub/ckv027
View details for Web of Science ID 000362971500013
View details for PubMedID 25805788
Development and testing of an objective structured clinical exam (OSCE) to assess socio-cultural dimensions of patient safety competency
BMJ QUALITY & SAFETY
2015; 24 (3): 188–94
Patient safety (PS) receives limited attention in health professional curricula. We developed and pilot tested four Objective Structured Clinical Examination (OSCE) stations intended to reflect socio-cultural dimensions in the Canadian Patient Safety Institute's Safety Competency Framework.

Eighteen third-year undergraduate medical and nursing students at a Canadian university participated.

OSCE cases were developed by faculty with clinical and PS expertise, with assistance from expert facilitators from the Medical Council of Canada. Stations reflect domains in the Safety Competency Framework (ie, managing safety risks, culture of safety, communication). Stations were assessed by two clinical faculty members. Inter-rater reliability was examined using weighted κ values. Additional aspects of reliability and OSCE performance are reported.

Assessors exhibited excellent agreement (weighted κ scores ranged from 0.74 to 0.82 for the four OSCE stations). Learners' scores varied across the four stations. Nursing students scored significantly lower (p<0.05) than medical students on three stations (nursing student mean scores=1.9, 1.9 and 2.7; medical student mean scores=2.8, 2.9 and 3.5 for stations 1, 2 and 3, respectively, where 1=borderline unsatisfactory, 2=borderline satisfactory and 3=competence demonstrated). 7/18 students (39%) scored below 'borderline satisfactory' on one or more stations.

Results show (1) four OSCE stations evaluating socio-cultural dimensions of PS achieved variation in scores and (2) performance on this OSCE can be evaluated with high reliability, suggesting a single assessor per station would be sufficient. Differences between nursing and medical student performance are interesting; however, it is unclear what factors explain these differences.
View details for DOI 10.1136/bmjqs-2014-003277
View details for Web of Science ID 000349721000005
View details for PubMedID 25398630
View details for PubMedCentralID PMC4345888
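The agreement figures above are weighted κ values, which discount near-miss disagreements on an ordinal scale rather than treating every mismatch as total disagreement. A self-contained sketch of the statistic (the study presumably used standard statistical software; this pure-Python version is for illustration only):

```python
def weighted_kappa(r1, r2, n_categories, scheme="quadratic"):
    """Cohen's weighted kappa for two raters' ordinal scores in 0..n_categories-1."""
    n = len(r1)
    # observed joint proportions
    obs = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(r1, r2):
        obs[a][b] += 1.0 / n
    row = [sum(obs[i]) for i in range(n_categories)]
    col = [sum(obs[i][j] for i in range(n_categories)) for j in range(n_categories)]
    denom = (n_categories - 1) ** 2 if scheme == "quadratic" else (n_categories - 1)

    def w(i, j):
        # disagreement weight: 0 on the diagonal, growing with distance
        d = (i - j) ** 2 if scheme == "quadratic" else abs(i - j)
        return d / denom

    disagree_obs = sum(w(i, j) * obs[i][j]
                       for i in range(n_categories) for j in range(n_categories))
    disagree_exp = sum(w(i, j) * row[i] * col[j]
                       for i in range(n_categories) for j in range(n_categories))
    return 1.0 - disagree_obs / disagree_exp
```

Perfect agreement yields κ = 1, chance-level agreement yields κ near 0, and systematic disagreement drives κ negative, so values of 0.74-0.82 indicate that one assessor per station would score learners much like a second would.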
Psychometric properties of the multiple mini-interview used for medical admissions: findings from generalizability and Rasch analyses
ADVANCES IN HEALTH SCIENCES EDUCATION
2014; 19 (1): 71–84
The multiple mini-interview (MMI) has become an increasingly popular admissions method for selecting prospective students into professional programs (e.g., medical school). The MMI uses a series of short, labour intensive simulation stations and scenario interviews to more effectively assess applicants' non-cognitive qualities such as empathy, critical thinking, integrity, and communication. MMI data from 455 medical school applicants were analyzed using: (1) Generalizability Theory to estimate the generalizability of the MMI and identify sources of error; and (2) the Many-Facet Rasch Model, to identify misfitting examinees, items and raters. Consistent with previous research, our results support the reliability of the MMI process. However, it appears that the non-cognitive qualities are not being measured as unique constructs across stations.
View details for DOI 10.1007/s10459-013-9463-7
View details for Web of Science ID 000331630200007
View details for PubMedID 23709188
- Assessment of a master of education counselling application selection process using rasch analysis and generalizability theory Canadian Journal of Counselling and Psychotherapy 2014; 48 (2)
Understanding the complexities of validity using reflective practice
2014; 15 (4): 445–55
View details for DOI 10.1080/14623943.2014.900015
View details for Web of Science ID 000212785200003
Survey of northern informal and formal mental health practitioners
INTERNATIONAL JOURNAL OF CIRCUMPOLAR HEALTH
2013; 72: 135–41
This survey is part of a multi-year research study on informal and formal mental health support in northern Canada involving the use of qualitative and quantitative data collection and analysis methods in an effort to better understand mental health in a northern context.

The main objective of the 3-year study was to document the situation of formal and informal helpers in providing mental health support in isolated northern communities in northern British Columbia, northern Alberta, Yukon, Northwest Territories and Nunavut. The intent of developing a survey was to include more participants in the research and access those working in small communities who would be concerned regarding confidentiality and anonymity due to their high profile within smaller populations.

Based on the in-depth interviews from the qualitative phase of the project, the research team developed a survey that reflected the main themes found in the initial qualitative analysis. The on-line survey consisted of 26 questions, looking at basic demographic information and presenting lists of possible challenges, supports and client mental health issues for participants to prioritise.

Thirty-two participants identified various challenges, supports and client issues relevant to their mental health support work. A vast majority of the respondents felt prepared for northern practice and had some level of formal education. Supports for longevity included team collaboration; knowledgeable supervisors, managers and leaders; and more opportunities for formal education, specific training and continuity of care to support clients.

For northern-based research in small communities, the development of a survey allowed more participants to join the larger study in a way that protected their identity and confidentiality. The results from the survey emphasise the need for team collaboration, interdisciplinary practice and working with community strengths as a way to sustain mental health support workers in the North.
View details for DOI 10.3402/ijch.v72i0.20962
View details for Web of Science ID 000325721900021
View details for PubMedID 23984276
View details for PubMedCentralID PMC3753122
- Reflections on using data visualization techniques to engage stakeholders Queen’s University Graduate Student Symposium Proceedings 2012