All Publications

  • Facilitating Residents' Understanding of Electronic Health Record Report Card Data Using Faculty Feedback and Coaching. Academic Medicine Sebok-Syer, S. S., Shaw, J. M., Sedran, R., Shepherd, L., McConnell, A., Dukelow, A. M., Syer, M. D., Lingard, L. 2022


    Feedback continues to present a challenge for competency-based medical education (CBME). Clear, consistent, and credible feedback is vital to supporting one's ongoing development, yet it can be difficult to gather clinical performance data about residents. This study sought to determine whether providing residents with electronic health record (EHR)-based report cards, along with an opportunity to discuss these data with faculty trained in the R2C2 model, can help residents understand and interpret their clinical performance metrics.

    Using action research methodology, the author team collected EHR data from July 2017 to February 2020 for all residents (n = 21) in one 5-year Emergency Medicine program and created a personalized report card for each resident. During October 6-17, 2020, 8 of 17 eligible residents agreed to have their feedback conversations recorded and to participate in a subsequent interview with a non-physician member of the research team. Data were analyzed using thematic analysis, and the authors used inductive analysis to identify themes in the data.

    In analyzing both the feedback conversations and the individual interviews with faculty and residents, the authors identified 2 main themes: (1) reactions and responses to receiving personalized EHR data and (2) the value of EHR data for assessment and feedback purposes. All participants believed that EHR data metrics are useful for prompting self-reflection, and many pointed to their utility in suggesting actionable changes in clinical practice. For faculty, having a tool through which underperforming residents can be shown "objective" data about their clinical performance helps underscore the need for improvement, particularly when residents are resistant.

    The EHR is a valuable source of educational data, and this study demonstrates one of the many thoughtful ways it can be used for assessment and feedback purposes.

    View details for DOI 10.1097/ACM.0000000000004900

    View details for PubMedID 35947480

  • Defining and Adopting Clinical Performance Measures in Graduate Medical Education: Where Are We Now and Where Are We Going? Academic Medicine Smirnova, A., Sebok-Syer, S. S., Chahine, S., Kalet, A. L., Tamblyn, R., Lombarts, K. M., van der Vleuten, C. P., Schumacher, D. J. 2019


    Assessment and evaluation of trainees' clinical performance are needed to ensure safe, high-quality patient care. These measures also aid in the development of reflective, high-performing clinicians and hold graduate medical education (GME) accountable to the public. While clinical performance measures hold great potential, the challenges of defining, extracting, and measuring clinical performance hinder their use for educational and quality improvement purposes. This article provides a way forward by identifying and articulating how clinical performance measures can be used to enhance GME by linking educational objectives with relevant clinical outcomes. The authors explore four key challenges: defining clinical performance measures, measuring them, using electronic health record and clinical registry data to capture clinical performance, and bridging the silos of medical education and health care quality improvement. The authors also propose solutions to showcase the value of clinical performance measures and conclude with a research and implementation agenda. They argue for developing a common taxonomy of uniform specialty-specific clinical performance measures, linking these measures to large-scale GME databases, and applying both quantitative and qualitative methods to create a rich understanding of how GME affects quality of care and patient outcomes. The focus of this article is primarily GME, yet similar challenges and solutions will apply to other areas of medical and health professions education as well.

    View details for DOI 10.1097/ACM.0000000000002620

    View details for PubMedID 30720528

  • Considering the interdependence of clinical performance: implications for assessment and entrustment. Medical Education Sebok-Syer, S. S., Chahine, S., Watling, C. J., Goldszmidt, M., Cristancho, S., Lingard, L. 2018


    Our ability to assess independent trainee performance is a key element of competency-based medical education (CBME). In workplace-based clinical settings, however, the performance of a trainee can be deeply entangled with others on the team. This presents a fundamental challenge, given the need to assess and entrust trainees based on the evolution of their independent clinical performance. The purpose of this study, therefore, was to understand what faculty members and senior postgraduate trainees believe constitutes independent performance in a variety of clinical specialty contexts.

    Following constructivist grounded theory, and using both purposive and theoretical sampling, we conducted individual interviews with 11 clinical teaching faculty members and 10 senior trainees (postgraduate year 4/5) across 12 postgraduate specialties. Constant comparative inductive analysis was conducted. Findings were returned through one-to-one sessions with key informants and through public presentations.

    Although some independent performances were described, participants spoke mostly about the exceptions to and disclaimers about these, elaborating their sense of the interdependence of trainee performances. Our analysis of these interdependence patterns identified multiple configurations of coupling, the dominant being coupling of trainee and supervisor performance. We consider how the concept of coupling could advance workplace-based assessment efforts by supporting models that account for the collective dimensions of clinical performance.

    These findings call into question the assumption of independent performance and offer an important step toward measuring coupled performance. An understanding of coupling can help both to better distinguish independent and interdependent performances and to consider revising workplace-based assessment approaches for CBME.

    View details for DOI 10.1111/medu.13588

    View details for PubMedID 29676054

  • Learning Analytics in Medical Education Assessment: The Past, the Present, and the Future. AEM Education and Training Chan, T., Sebok-Syer, S., Thoma, B., Wise, A., Sherbino, J., Pusic, M. 2018; 2 (2): 178-187


    With the implementation of competency-based medical education (CBME) in emergency medicine, residency programs will amass substantial amounts of qualitative and quantitative data about trainees' performances. This increased volume of data will challenge traditional processes for assessing trainees and remediating training deficiencies. At the intersection of trainee performance data and statistical modeling lies the field of medical learning analytics. At a local training program level, learning analytics has the potential to assist program directors and competency committees with interpreting assessment data to inform decision making. On a broader level, learning analytics can be used to explore system questions and identify problems that may impact our educational programs. Scholars outside of health professions education have been exploring the use of learning analytics for years and their theories and applications have the potential to inform our implementation of CBME. The purpose of this review is to characterize the methodologies of learning analytics and explore their potential to guide new forms of assessment within medical education.

    View details for DOI 10.1002/aet2.10087

    View details for PubMedID 30051086

    View details for PubMedCentralID PMC6001721

  • Using electronic health record data to assess residents’ performance in the clinical workplace: The good, the bad, and the unthinkable Academic Medicine Sebok-Syer, S. S., Goldszmidt, M. A., Watling, C. J., Tamblyn, R., Chahine, S., Venance, S. V., Lingard, L. A. 2018
  • Mixed Messages or Miscommunication? Investigating the Relationship Between Assessors' Workplace-Based Assessment Scores and Written Comments Academic Medicine Sebok-Syer, S. S., Klinger, D. A., Sherbino, J., Chan, T. M. 2017; 92 (12): 1774–1779


    The shift toward broader, programmatic assessment has revolutionized the approaches that many take in assessing medical competence. To understand the association between quantitative and qualitative evaluations, the authors explored the relationships that exist among assessors' checklist scores, task ratings, global ratings, and written comments.

    The authors collected data from the McMaster Modular Assessment Program and analyzed them using regression analyses. The data were from emergency medicine residents in their first or second year of postgraduate training from 2012 through 2014. Additionally, using content analysis, the authors analyzed narrative comments corresponding to the "done" and "done, but needs attention" checklist score options.

    The regression analyses revealed that the task ratings, provided by faculty assessors, are associated with the use of the "done, but needs attention" checklist score option. Analyses also identified that the "done, but needs attention" option is associated with a narrative comment that is balanced, providing both strengths and areas for improvement. Analysis of qualitative comments revealed differences in the type of comments provided to higher- and lower-performing residents.

    This study highlights some of the relationships that exist among checklist scores, rating scales, and written comments. The findings indicate that task ratings are associated with checklist options while global ratings are not. Furthermore, the analysis of written comments supports the notion of a "hidden code" used to communicate assessors' evaluation of medical competence, especially when communicating areas for improvement or concern. This study has implications for how individuals should interpret information obtained from qualitative assessments.

    View details for DOI 10.1097/ACM.0000000000001743

    View details for Web of Science ID 000419151600038

    View details for PubMedID 28562452

  • A lasting impact? Exploring the immediate and longitudinal impact of an emergency department service learning help desk program AEM Education and Training Cohen, A., Hu, S., Bellon, M., Wang, N., Sebok-Syer, S. S. 2022; 6 (3)

    View details for DOI 10.1002/aet2.10760

    View details for Web of Science ID 000808017100001

  • Statistical points and pitfalls: growth modeling Perspectives on Medical Education Boscardin, C. K., Sebok-Syer, S. S., Pusic, M. 2022; 11 (2): 104-107

    View details for DOI 10.1007/s40037-022-00703-1

    View details for Web of Science ID 000769840500001

    View details for PubMedID 35294733

  • Who's on your team? Specialty identity and inter-physician conflict during admissions Medical Education Schrepel, C., Amick, A. E., Bann, M., Watsjold, B., Jauregui, J., Ilgen, J. S., Sebok-Syer, S. S. 2022; 56 (6): 625-633


    Despite the implementation of professionalism curricula and standardised communication tools, inter-physician conflict persists. In particular, the interface between emergency medicine (EM) and internal medicine (IM) has long been recognised as a source of conflict. The social nuances of this conflict remain underexplored, limiting educators' ability to comprehensively address these issues in the clinical learning environment. Thus, the authors explored EM and IM physicians' experiences with negotiating hospital admissions to better understand the social dynamics that contribute to inter-physician conflict and provide foundational guidance for communication best practices.

    Using a constructivist grounded theory (CGT) approach, the authors conducted 18 semi-structured interviews between June and October 2020 with EM and IM physicians involved in conversations regarding admissions (CRAs). They asked participants to describe the social exchanges that influenced these conversations and to reflect on their experiences with inter-physician conflict. Data collection and analysis occurred iteratively. The research team discussed the relationships between the codes with the goal of developing conceptual connections between the emergent themes.

    Participants described how their approaches to CRAs were shaped by their specialty identity, and how allegiance to members of their group contributed to interpersonal conflict. This conflict was further promoted by a mutual sense of disempowerment within the organisation, misaligned expectations, and a desire to promote their group's prerogatives. Conflict was mitigated when patient care experiences fostered cross-specialty team formation and collaboration that dissolved traditional group boundaries.

    Conflict between EM and IM physicians during CRAs was primed by participants' specialty identities, their power struggles within the broader organisation, and their sense of duty to their own specialty. However, the formation of collaborative inter-specialty physician teams and the expansion of identity to include colleagues from other specialties can mitigate inter-physician conflict.

    View details for DOI 10.1111/medu.14715

    View details for Web of Science ID 000743339000001

    View details for PubMedID 34942027

  • Assessment of Entrustable Professional Activities Using a Web-Based Simulation Platform During Transition to Emergency Medicine Residency: Mixed Methods Pilot Study. JMIR Medical Education Peng, C. R., Schertzer, K. A., Caretta-Weyer, H. A., Sebok-Syer, S. S., Lu, W., Tansomboon, C., Gisondi, M. A. 2021; 7 (4): e32356


    BACKGROUND: The 13 core entrustable professional activities (EPAs) are key competency-based learning outcomes in the transition from undergraduate to graduate medical education in the United States. Five of these EPAs (EPA2: prioritizing differentials, EPA3: recommending and interpreting tests, EPA4: entering orders and prescriptions, EPA5: documenting clinical encounters, and EPA10: recognizing urgent and emergent conditions) are uniquely suited for web-based assessment.

    OBJECTIVE: In this pilot study, we created cases on a web-based simulation platform for the diagnostic assessment of these EPAs and examined the feasibility and acceptability of the platform.

    METHODS: Four simulation cases underwent 3 rounds of consensus panels and pilot testing. Incoming emergency medicine interns (N=15) completed all cases. A maximum of 4 "look for" statements, which encompassed specific EPAs, were generated for each participant: (1) performing harmful or missing actions, (2) narrowing differential or wrong final diagnosis, (3) errors in documentation, and (4) lack of recognition and stabilization of urgent diagnoses. Finally, we interviewed a sample of interns (n=5) and residency leadership (n=5) and analyzed the responses using thematic analysis.

    RESULTS: All participants had at least one missing critical action, and 40% (6/15) of the participants performed at least one harmful action across all 4 cases. The final diagnosis was not included in the differential diagnosis in more than half of the assessments (8/15, 54%). Other errors included selecting incorrect documentation passages (6/15, 40%) and indiscriminately applying oxygen (9/15, 60%). The interview themes included psychological safety of the interface, ability to assess learning, and fidelity of cases. The most valuable feature cited was the ability to place orders in a realistic electronic medical record interface.

    CONCLUSIONS: This study demonstrates the feasibility and acceptability of a web-based platform for diagnostic assessment of specific EPAs. The approach rapidly identifies potential areas of concern for incoming interns using an asynchronous format, provides feedback in a manner appreciated by residency leadership, and informs individualized learning plans.

    View details for DOI 10.2196/32356

    View details for PubMedID 34787582

  • The Birth of a Return to Work Policy for New Resident Parents in Emergency Medicine. Academic Emergency Medicine Gordon, A. J., Sebok-Syer, S., Dohn, A. M., Smith-Coggins, R., Wang, N. E., Williams, S. R., Gisondi, M. A. 2019


    OBJECTIVE: With the rising number of female physicians, more children than ever are born during residency, and the current system is inadequate to handle this increase in new resident parents. Residency is stressful and rigorous on its own, let alone when pregnant or with a new child. Policies that ease these stressful transitions are generally either insufficient or nonexistent. Therefore, we created a comprehensive Return to Work Policy for resident parents and piloted its implementation. Our policy aims to: 1) establish a clear, shared understanding of the regulatory and training requirements as they pertain to parental leave, 2) facilitate a smooth transition for new parents returning to work, and 3) summarize the local and institutional resources available to both men and women during residency training.

    METHOD: In Fall 2017, a task force was convened to draft a Return to Work Policy for New Resident Parents. The task force included 9 key stakeholders (i.e., residents, faculty, and administration) at our institution: 3 Graduate Medical Education (GME) Program Directors, a Vice Chair of Education, a Designated Institutional Official (DIO), a Chief Resident, and 3 members of our academic department's Faculty Affairs Committee. The task force was selected for its members' expertise in gender equity issues, mentorship of resident parents, GME, and departmental administration.

    RESULTS: After development, the policy was piloted from November 2017 to June 2018. The pilot implementation period included 7 new resident parents. All of these residents received schedules that met the return to work scheduling terms of our policy, including no overnight shifts, no sick call, and no more than 3 shifts in a row. Of equal importance, throughout the pilot, the emergency department schedules at all of our clinical sites remained fully staffed and our sick call pool was unaffected.

    CONCLUSION: Our Return to Work Policy for New Resident Parents provides a comprehensive guide to training requirements and family leave policies, an overview of available resources, and a scheduling framework that makes for a smooth transition back to clinical duties.

    View details for PubMedID 30636353

  • Examining Differential Rater Functioning using a Between-Subgroup Outfit Approach Journal of Educational Measurement Wind, S. A., Sebok-Syer, S. S. 2019
  • You want me to assess what? Faculty perceptions of assessing residents from outside their specialty Academic Medicine Burm, S., Sebok-Syer, S. S., Lingard, L., VanHooren, T., Chahine, S., Goldszmidt, M., Watling, C. J. 2019
  • A Call to Investigate the Relationship Between Education and Health Outcomes Using Big Data Academic Medicine Chahine, S., Kulasegaram, K., Wright, S., Monteiro, S., Grierson, L. M., Barber, C., Sebok-Syer, S. S., McConnell, M., Yen, W., De Champlain, A., Touchie, C. 2018; 93 (6): 829–832


    There exists an assumption that improving medical education will improve patient care. While seemingly logical, this premise has rarely been investigated. In this Invited Commentary, the authors propose the use of big data to test this assumption. The authors present a few example research studies linking education and patient care outcomes and argue that using big data may more easily facilitate the process needed to investigate this assumption. The authors also propose that collaboration is needed to link educational and health care data. They then introduce a grassroots initiative, inclusive of universities in one Canadian province and national licensing organizations, that is working to collect, organize, link, and analyze big data to study the relationship between pedagogical approaches to medical training and patient care outcomes. While the authors acknowledge the possible challenges and issues associated with harnessing big data, they believe that the benefits outweigh these. There is a need for medical education research to go beyond the outcomes of training to study practice and clinical outcomes as well. Without a coordinated effort to harness big data, policy makers, regulators, medical educators, and researchers are left with sometimes costly guesses and assumptions about what works and what does not. As the social, time, and financial investments in medical education continue to increase, it is imperative to understand the relationship between education and health outcomes.

    View details for DOI 10.1097/ACM.0000000000002217

    View details for Web of Science ID 000435369500022

    View details for PubMedID 29538109

  • Using electronic health record data to assess emergency medicine trainees' independent and interdependent performance: a qualitative perspective on measuring what matters. Canadian Journal of Emergency Medicine Shepherd, L., Sebok-Syer, S. S., Lingard, L., McConnell, A., Sedran, R., Dukelow, A. 2018; 20

    View details for DOI 10.1017/cem.2018.336

  • Reliability and validity evidence for the quality of assessment for learning (QuAL) score Academic Emergency Medicine Chan, T. M., Sebok-Syer, S. S., Sampson, C., Monteiro, S. 2018
  • Quality Evaluation Scores are no more Reliable than Gestalt in Evaluating the Quality of Emergency Medicine Blogs: A METRIQ Study Teaching and Learning in Medicine Thoma, B., Sebok-Syer, S. S., Colmers-Gray, I., Sherbino, J., Ankel, F., Trueger, N., Grock, A., Siemens, M., Paddock, M., Purdy, E., Milne, W., Chan, T. M., METRIQ Study Collaborators 2018; 30 (3): 294–302


    Construct: We investigated the quality of emergency medicine (EM) blogs as educational resources.

    Online medical education resources such as blogs are increasingly used by EM trainees and clinicians. However, quality evaluations of these resources using gestalt are unreliable. We investigated the reliability of two previously derived quality evaluation instruments for blogs.

    Sixty English-language EM websites that published clinically oriented blog posts between January 1 and February 24, 2016, were identified. A random number generator selected 10 websites, and the 2 most recent clinically oriented blog posts from each site were evaluated using gestalt, the Academic Life in Emergency Medicine (ALiEM) Approved Instructional Resources (AIR) score, and the Medical Education Translational Resources: Impact and Quality (METRIQ-8) score, by a sample of medical students, EM residents, and EM attendings. Each rater evaluated all 20 blog posts with gestalt and 15 of the 20 blog posts with the ALiEM AIR and METRIQ-8 scores. Pearson's correlations were calculated between the average scores for each metric. Single-measure intraclass correlation coefficients (ICCs) evaluated the reliability of each instrument.

    Our study included 121 medical students, 88 EM residents, and 100 EM attendings who completed ratings. The average gestalt rating of each blog post correlated strongly with the average scores for ALiEM AIR (r = .94) and METRIQ-8 (r = .91). Single-measure ICCs were fair for gestalt (0.37, IQR 0.25-0.56), ALiEM AIR (0.41, IQR 0.29-0.60), and METRIQ-8 (0.40, IQR 0.28-0.59).

    The average scores of each blog post correlated strongly with gestalt ratings. However, neither ALiEM AIR nor METRIQ-8 showed higher reliability than gestalt. Improved reliability may be possible through rater training and instrument refinement.

    View details for DOI 10.1080/10401334.2017.1414609

    View details for Web of Science ID 000435016500007

    View details for PubMedID 29381099

  • Comparison of Simulation-based Resuscitation Performance Assessments With In-training Evaluation Reports in Emergency Medicine Residents: A Canadian Multicenter Study. AEM Education and Training Hall, A. K., Damon Dagnone, J., Moore, S., Woolfrey, K. G., Ross, J. A., McNeil, G., Hagel, C., Davison, C., Sebok-Syer, S. S. 2017; 1 (4): 293-300


    Simulation stands to serve an important role in modern competency-based programs of assessment in postgraduate medical education. Our objective was to compare the performance of individual emergency medicine (EM) residents in a simulation-based resuscitation objective structured clinical examination (OSCE) using the Queen's Simulation Assessment Tool (QSAT) with portfolio assessment of clinical encounters using a modified in-training evaluation report (ITER), to understand in greater detail the inferences that may be drawn from a simulation-based OSCE assessment.

    A prospective observational study was employed to explore the use of a multicenter simulation-based OSCE for evaluation of resuscitation competence. EM residents from five Canadian academic sites participated in the OSCE. Video-recorded performances were scored by blinded raters using scenario-specific QSATs with domain-specific anchored scores (primary assessment, diagnostic actions, therapeutic actions, communication) and a global assessment score (GAS). Residents' portfolios were evaluated using a modified ITER subdivided by CanMEDS roles (medical expert, communicator, collaborator, leader, health advocate, scholar, and professional) and a GAS. Correlational and regression analyses were performed comparing components of each of the assessment methods.

    Portfolio review and ITER scoring were performed for 79 residents participating in the simulation-based OSCE. There was a significant positive correlation between total OSCE and ITER scores (r = 0.341). The strongest correlations were found between the ITER medical expert score and each of the OSCE GAS (r = 0.420), communication (r = 0.443), and therapeutic action (r = 0.484) domains. ITER medical expert was a significant predictor of OSCE total (p = 0.002), and OSCE therapeutic action was a significant predictor of ITER total (p = 0.02).

    Simulation-based resuscitation OSCEs and portfolio assessment captured by ITERs appear to measure differing aspects of competence, with weak to moderate correlation between measures of conceptually similar constructs. In a program of competency-based assessment of EM residents, a simulation-based OSCE using the QSAT shows promise as a tool for assessing the medical expert and communicator roles.

    View details for DOI 10.1002/aet2.10055

    View details for PubMedID 30051047

    View details for PubMedCentralID PMC6001706

  • Individual Gestalt Is Unreliable for the Evaluation of Quality in Medical Education Blogs: A METRIQ Study Annals of Emergency Medicine Thoma, B., Sebok-Syer, S. S., Krishnan, K., Siemens, M., Trueger, N., Colmers-Gray, I., Woods, R., Petrusa, E., Chan, T., METRIQ Study Collaborators 2017; 70 (3): 394–401


    Open educational resources such as blogs are increasingly used for medical education. Gestalt is generally the evaluation method used for these resources; however, little information has been published on it. We aim to evaluate the reliability of gestalt in the assessment of emergency medicine blogs.

    We identified 60 English-language emergency medicine Web sites that posted clinically oriented blogs between January 1, 2016, and February 24, 2016. Ten Web sites were selected with a random-number generator. Medical students, emergency medicine residents, and emergency medicine attending physicians evaluated the 2 most recent clinical blog posts from each site for quality, using a 7-point Likert scale. The mean gestalt scores of each blog post were compared between groups with Pearson's correlations. Single- and average-measure intraclass correlation coefficients were calculated within groups. A generalizability study evaluated variance within gestalt, and a decision study calculated the number of raters required to reliably (>0.8) estimate quality.

    One hundred twenty-one medical students, 88 residents, and 100 attending physicians (93.6% of enrolled participants) evaluated all 20 blog posts. Single-measure intraclass correlation coefficients within groups were fair to poor (0.36 to 0.40). Average-measure intraclass correlation coefficients were more reliable (0.811 to 0.840). Mean gestalt ratings by attending physicians correlated strongly with those by medical students (r=0.92) and residents (r=0.99). The generalizability coefficient was 0.91 for the complete data set. The decision study found that 42 gestalt ratings were required to reliably evaluate quality (>0.8).

    The mean gestalt quality ratings of blog posts between medical students, residents, and attending physicians correlate strongly, but individual ratings are unreliable. With sufficient raters, mean gestalt ratings provide a community standard for assessment.

    View details for DOI 10.1016/j.annemergmed.2016.12.025

    View details for Web of Science ID 000410255300022

    View details for PubMedID 28262317
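
    The single- versus average-measure reliability contrast in the two METRIQ studies above can be illustrated with a short, self-contained computation. This is a hedged sketch on simulated ratings, not the studies' data: it computes one-way random-effects ICCs for a targets-by-raters matrix, where the average-measure value ICC(1,k) follows from the single-measure value ICC(1,1) by the Spearman-Brown relation. Note the studies' decision study used a generalizability framework with additional variance components, so the simple Spearman-Brown projection below will not reproduce their estimate of 42 raters.

    ```python
    import math

    import numpy as np


    def icc_oneway(ratings):
        """One-way random-effects ICCs for an (n_targets x k_raters) matrix.

        Returns ICC(1,1) (single measure) and ICC(1,k) (average measure).
        """
        n, k = ratings.shape
        grand = ratings.mean()
        row_means = ratings.mean(axis=1)
        # Between-target and within-target mean squares from one-way ANOVA
        msb = k * ((row_means - grand) ** 2).sum() / (n - 1)
        msw = ((ratings - row_means[:, None]) ** 2).sum() / (n * (k - 1))
        single = (msb - msw) / (msb + (k - 1) * msw)
        average = (msb - msw) / msb  # equals Spearman-Brown applied to `single`
        return single, average


    def raters_needed(single, target=0.8):
        """Smallest k with k*s / (1 + (k-1)*s) >= target (Spearman-Brown)."""
        return math.ceil(target * (1 - single) / (single * (1 - target)))


    # Simulated data: 20 blog posts, 10 raters, modest true quality signal
    rng = np.random.default_rng(0)
    true_quality = rng.normal(size=(20, 1))
    ratings = true_quality + rng.normal(scale=1.2, size=(20, 10))
    single, average = icc_oneway(ratings)
    ```

    With a single-measure ICC in the fair-to-poor range reported above (~0.37), pooling many raters drives the average-measure reliability up sharply, which is why mean gestalt ratings can serve as a community standard even though individual ratings are unreliable.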

  • Hawks, Doves and Rasch decisions: Understanding the influence of different cycles of an OSCE on students' scores using Many Facet Rasch Modeling Medical Teacher Yeates, P., Sebok-Syer, S. S. 2017; 39 (1): 92–99


    OSCEs are commonly conducted in multiple cycles (different circuits, times, and locations), yet the potential for students' allocation to different OSCE cycles is rarely considered as a source of variance, perhaps in part because conventional psychometrics provide limited insight.

    We used Many Facet Rasch Modeling (MFRM) to estimate the influence of "examiner cohorts" (the combined influence of the examiners in the cycle to which each student was allocated) on students' scores within a fully nested multi-cycle OSCE.

    Observed average scores for examiner cycles varied by 8.6%, but model-adjusted estimates showed a smaller range of 4.4%. Most students' scores were only slightly altered by the model; the greatest score increase was 5.3% and the greatest score decrease was -3.6%, with 2 students passing who would have failed.

    Despite using 16 examiners per cycle, examiner variability did not completely counterbalance, resulting in an influence of OSCE cycles on students' scores. Assumptions were required for the MFRM analysis; innovative procedures to overcome these limitations and strengthen OSCEs are discussed.

    OSCE cycle allocation has the potential to exert a small but unfair influence on students' OSCE scores; these little-considered influences should challenge our assumptions about, and design of, OSCEs.

    View details for DOI 10.1080/0142159X.2017.1248916

    View details for Web of Science ID 000393885800015

    View details for PubMedID 27897083

  • “It’s Complicated”: Understanding the Relationships Between Checklists, Rating Scales, and Written Comments in Workplace-Based Assessments Academic Medicine Sebok-Syer, S. S., Klinger, D. A., Sherbino, J., Chan, T. M. 2016; 91 (11)
  • Competency-based simulation assessment of resuscitation skills in emergency medicine postgraduate trainees – A Canadian multi-centered study Canadian Medical Education Journal Dagnone, J. D., Hall, A. K., Sebok-Syer, S. S., Klinger, D., Woolfrey, K., Davison, C., Ross, J., McNeil, G., Moore, S. 2016; 7 (1)
  • Seeing Things Differently or Seeing Different Things? Exploring Raters' Associations of Noncognitive Attributes Academic Medicine Sebok, S. S., Syer, M. D. 2015; 90 (11): S50–S55


    Raters represent a significant source of unexplained, and often undesired, variance in performance-based assessments. To better understand rater variance, this study investigated how various raters, observing the same performance, perceived relationships amongst the different noncognitive attributes measured in performance assessments.

    Medical admissions data from a Multiple Mini-Interview (MMI) used at one Canadian medical school were collected and subsequently analyzed using the Many Facet Rasch Model (MFRM) and hierarchical clustering. This particular MMI consisted of eight stations. At each station a faculty member and an upper-year medical student rated applicants on various noncognitive attributes including communication, critical thinking, effectiveness, empathy, integrity, maturity, professionalism, and resolution.

    The Rasch analyses revealed differences between faculty and student raters across the eight MMI stations. These analyses also identified that, at times, raters were unable to distinguish between the various noncognitive attributes. Hierarchical clustering highlighted differences in how faculty and student raters observed the various noncognitive attributes. Differences in how individual raters associated the various attributes within a station were also observed.

    The MFRM and hierarchical clustering helped to explain some of the variability associated with raters in a way that other measurement models are unable to capture. These findings highlight that differences in ratings may result from raters possessing different interpretations of an observed performance. This study has implications for developing more purposeful rater selection and rater profiling in performance-based assessments.

    View details for DOI 10.1097/ACM.0000000000000902

    View details for Web of Science ID 000375840200008

    View details for PubMedID 26505102

  • Examiners and content and site: Oh My! A national organization's investigation of score variation in large-scale performance assessments ADVANCES IN HEALTH SCIENCES EDUCATION Sebok, S. S., Roy, M., Klinger, D. A., De Champlain, A. F. 2015; 20 (3): 581–94


    Examiner effects and content specificity are two well-known sources of construct-irrelevant variance that present great challenges in performance-based assessments. National medical organizations responsible for large-scale performance-based assessments face an additional challenge, as they must administer qualification examinations to physician candidates at several locations and institutions. This study explores the impact of site location as a source of score variation in a large-scale national assessment used to measure the readiness of internationally educated physician candidates for residency programs. Data from the Medical Council of Canada's National Assessment Collaboration were analyzed using Hierarchical Linear Modeling and Rasch analyses. Consistent with previous research, problematic variance due to examiner effects and content specificity was found. Additionally, site location was identified as a potential source of construct-irrelevant variance in examination scores.

    View details for DOI 10.1007/s10459-014-9547-z

    View details for Web of Science ID 000357644900002

    View details for PubMedID 25164266

  • Cross-national trends in perceived school pressure by gender and age from 1994 to 2010 EUROPEAN JOURNAL OF PUBLIC HEALTH Klinger, D. A., Freeman, J. G., Bilz, L., Liiv, K., Ramelow, D., Sebok, S. S., Samdal, O., Duer, W., Rasmussen, M. 2015; 25: 51–56


    Pressure within school can be a critical component in understanding how the school experience influences young people's intellectual development, physical and mental health, and future educational decisions. Data from five survey rounds (1993/1994, 1997/1998, 2001/2002, 2005/2006 and 2009/2010) were used to examine time-, age- and gender-related trends in the amounts of reported school pressure among 11-, 13- and 15-year-olds, in five different regions (North America, Great Britain, Eastern Europe, Nordic and Germanic countries). Across the regions, the reported perceptions of school pressure did not change between 1994 and 2010, despite a temporary increase in 2002 and 2006. With the exception of children at 11 years of age, girls reported higher levels of school pressure than boys (Cohen's d from 0.12 to 0.58) and school pressure was higher in older age groups. These findings were consistent across countries. Regionally, children in North America reported the highest levels of school pressure, and students in the Germanic countries the lowest. Factors associated with child development and differences in societal expectations and structures, along with the possible, albeit differential, impact of the Programme for International Student Assessment (PISA), may partially explain the differences and trends found in school pressure. School pressure increases alongside the onset of adolescence and the shift from elementary school to the more demanding expectations of secondary education. Time-related increases in school pressure occurred in the years following the release of the PISA results, and were larger in those regions in which results were less positive.

    View details for DOI 10.1093/eurpub/ckv027

    View details for Web of Science ID 000362971500013

    View details for PubMedID 25805788

  • Development and testing of an objective structured clinical exam (OSCE) to assess socio-cultural dimensions of patient safety competency BMJ QUALITY & SAFETY Ginsburg, L. R., Tregunno, D., Norton, P. G., Smee, S., de Vries, I., Sebok, S. S., VanDenKerkhof, E. G., Luctkar-Flude, M., Medves, J. 2015; 24 (3): 188–94


    Patient safety (PS) receives limited attention in health professional curricula. We developed and pilot tested four Objective Structured Clinical Examination (OSCE) stations intended to reflect socio-cultural dimensions in the Canadian Patient Safety Institute's Safety Competency Framework. Eighteen third-year undergraduate medical and nursing students at a Canadian university participated. OSCE cases were developed by faculty with clinical and PS expertise with assistance from expert facilitators from the Medical Council of Canada. Stations reflect domains in the Safety Competency Framework (ie, managing safety risks, culture of safety, communication). Stations were assessed by two clinical faculty members. Inter-rater reliability was examined using weighted κ values. Additional aspects of reliability and OSCE performance are reported. Assessors exhibited excellent agreement (weighted κ scores ranged from 0.74 to 0.82 for the four OSCE stations). Learners' scores varied across the four stations. Nursing students scored significantly lower (p<0.05) than medical students on three stations (nursing student mean scores=1.9, 1.9 and 2.7; medical student mean scores=2.8, 2.9 and 3.5 for stations 1, 2 and 3, respectively, where 1=borderline unsatisfactory, 2=borderline satisfactory and 3=competence demonstrated). 7/18 students (39%) scored below 'borderline satisfactory' on one or more stations. Results show (1) four OSCE stations evaluating socio-cultural dimensions of PS achieved variation in scores and (2) performance on this OSCE can be evaluated with high reliability, suggesting a single assessor per station would be sufficient. Differences between nursing and medical student performance are interesting; however, it is unclear what factors explain these differences.

    View details for DOI 10.1136/bmjqs-2014-003277

    View details for Web of Science ID 000349721000005

    View details for PubMedID 25398630

    View details for PubMedCentralID PMC4345888
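
    The weighted κ statistic used for inter-rater reliability above can be illustrated with a short sketch. This is a generic, linear-weighted Cohen's kappa for two raters on an ordinal scale; the ratings shown are invented for illustration, not the study's data.

    ```python
    from collections import Counter

    def weighted_kappa(rater_a, rater_b, categories):
        """Linear-weighted Cohen's kappa for two raters on an ordinal scale."""
        k = len(categories)
        index = {c: i for i, c in enumerate(categories)}
        n = len(rater_a)
        # Observed joint counts and per-rater marginal totals
        observed = Counter(zip(rater_a, rater_b))
        marg_a = Counter(rater_a)
        marg_b = Counter(rater_b)
        # Linear disagreement weight: |i - j| / (k - 1); 0 for exact agreement
        num = sum(abs(index[a] - index[b]) / (k - 1) * cnt
                  for (a, b), cnt in observed.items())
        # Expected weighted disagreement under chance (product of marginals)
        den = sum(abs(index[a] - index[b]) / (k - 1)
                  * marg_a[a] * marg_b[b] / n
                  for a in categories for b in categories)
        return 1 - num / den

    ratings_1 = [1, 2, 3, 3, 2, 1, 3, 2]
    ratings_2 = [1, 2, 3, 2, 2, 1, 3, 3]
    print(round(weighted_kappa(ratings_1, ratings_2, [1, 2, 3]), 3))  # → 0.704
    ```

    A value near the study's 0.74–0.82 range would similarly indicate substantial agreement beyond chance.
    
    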

  • Psychometric properties of the multiple mini-interview used for medical admissions: findings from generalizability and Rasch analyses ADVANCES IN HEALTH SCIENCES EDUCATION Sebok, S. S., Luu, K., Klinger, D. A. 2014; 19 (1): 71–84


    The multiple mini-interview (MMI) has become an increasingly popular admissions method for selecting prospective students into professional programs (e.g., medical school). The MMI uses a series of short, labour-intensive simulation stations and scenario interviews to more effectively assess applicants' non-cognitive qualities such as empathy, critical thinking, integrity, and communication. MMI data from 455 medical school applicants were analyzed using: (1) Generalizability Theory, to estimate the generalizability of the MMI and identify sources of error; and (2) the Many-Facet Rasch Model, to identify misfitting examinees, items and raters. Consistent with previous research, our results support the reliability of the MMI process. However, it appears that the non-cognitive qualities are not being measured as unique constructs across stations.

    View details for DOI 10.1007/s10459-013-9463-7

    View details for Web of Science ID 000331630200007

    View details for PubMedID 23709188
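
    As a hedged illustration of the generalizability analysis named above: the sketch below estimates variance components for a fully crossed persons × stations design (one observation per cell) from expected mean squares, then returns the relative G coefficient. It is a generic textbook formulation with made-up scores, not the study's actual design or data.

    ```python
    def g_coefficient(scores):
        """Relative G coefficient for a fully crossed persons x stations design
        (one observation per cell), estimated via expected mean squares."""
        n_p, n_i = len(scores), len(scores[0])
        gm = sum(sum(row) for row in scores) / (n_p * n_i)
        p_means = [sum(row) / n_i for row in scores]
        i_means = [sum(scores[p][i] for p in range(n_p)) / n_p for i in range(n_i)]
        # Mean square for persons, and residual (p x i interaction + error)
        ms_p = n_i * sum((m - gm) ** 2 for m in p_means) / (n_p - 1)
        ss_res = sum((scores[p][i] - p_means[p] - i_means[i] + gm) ** 2
                     for p in range(n_p) for i in range(n_i))
        ms_res = ss_res / ((n_p - 1) * (n_i - 1))
        var_p = (ms_p - ms_res) / n_i   # universe-score (person) variance
        var_res = ms_res                # residual variance
        # Relative error shrinks as more stations are averaged over
        return var_p / (var_p + var_res / n_i)

    scores = [[2, 4], [3, 4], [5, 6]]  # 3 hypothetical applicants x 2 stations
    print(round(g_coefficient(scores), 3))  # → 0.952
    ```

    Values near 1 indicate that applicant rank-ordering is stable across stations; adding stations increases the coefficient by shrinking the averaged residual term.
    
    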

  • Assessment of a master of education counselling application selection process using Rasch analysis and generalizability theory Canadian Journal of Counselling and Psychotherapy Sebok-Syer, S. S., MacMillan, P. D. 2014; 48 (2)
  • Understanding the complexities of validity using reflective practice REFLECTIVE PRACTICE Sebok, S. 2014; 15 (4): 445–55
  • Survey of northern informal and formal mental health practitioners INTERNATIONAL JOURNAL OF CIRCUMPOLAR HEALTH O'Neill, L., George, S., Sebok, S. 2013; 72: 135–41


    This survey is part of a multi-year research study on informal and formal mental health support in northern Canada involving the use of qualitative and quantitative data collection and analysis methods in an effort to better understand mental health in a northern context. The main objective of the 3-year study was to document the situation of formal and informal helpers in providing mental health support in isolated northern communities in northern British Columbia, northern Alberta, Yukon, Northwest Territories and Nunavut. The intent of developing a survey was to include more participants in the research and access those working in small communities who would be concerned regarding confidentiality and anonymity due to their high profile within smaller populations. Based on the in-depth interviews from the qualitative phase of the project, the research team developed a survey that reflected the main themes found in the initial qualitative analysis. The on-line survey consisted of 26 questions, looking at basic demographic information and presenting lists of possible challenges, supports and client mental health issues for participants to prioritise. Thirty-two participants identified various challenges, supports and client issues relevant to their mental health support work. A vast majority of the respondents felt prepared for northern practice and had some level of formal education. Supports for longevity included team collaboration, knowledgeable supervisors, managers, leaders and more opportunities for formal education, specific training and continuity of care to support clients. For northern-based research in small communities, the development of a survey allowed more participants to join the larger study in a way that protected their identity and confidentiality. The results from the survey emphasise the need for team collaboration, interdisciplinary practice and working with community strengths as a way to sustain mental health support workers in the North.

    View details for DOI 10.3402/ijch.v72i0.20962

    View details for Web of Science ID 000325721900021

    View details for PubMedID 23984276

    View details for PubMedCentralID PMC3753122

  • Reflections on using data visualization techniques to engage stakeholders Queen’s University Graduate Student Symposium Proceedings Lam, C. Y., Ma, J., Sebok, S., Chapman, A. E., Mei, Y. 2012