
It was a historic Match Day for DO students and graduates. This year’s NRMP Main Residency Match was the largest on record, with over 40,000 applicants applying to positions in 5,048 programs across the US. Applicants in this year’s Match increased by 4.5% over last year, competing for 37,256 available positions. On March 20th, the 6,215 fourth-year osteopathic medical students and graduates who matched to PGY-1 positions learned where they’ll be spending the next three or more years in residency training programs in the specialty in which they’ll work.

NRMP President and CEO, Donna Lamb, DHSc, MBA, BSN, reported, “We are especially excited that the 2020 Match marks a milestone for the medical education community: The first Single Match for U.S. MD and DO senior students and graduates and the inclusion of DO senior students as sponsored applicants.”


The results of this year’s Match, the first Single Match under the single accreditation system, continue to trend positively for osteopathic graduates entering residency training. Despite being socially distanced and self-quarantined due to the COVID-19 pandemic, students, schools, programs, specialty societies and others held remote “Virtual Match Day” celebrations in observance of the highest number of DO seniors who matched to PGY-1 positions (5,968) at a rate of 90.7%, compared to a match rate of 80.8% for all applicant types. Applicants who did not match to a residency position participated in the Match Week Supplemental Offer and Acceptance Program (SOAP) to obtain one of 1,897 positions available. More DO applicants matched during SOAP, with data to be released in early May.

DO applicants in this year’s Match accounted for the largest increase among applicant groups: up by 19.2% (1,153 applicants), with a 20.1% increase in DO seniors (up 1,103) compared to last year. Increased match rates for DO seniors, up 2.6% from 2019 and up 13% from 2016, are especially impressive considering the increased number of applicants. The match rate for all osteopathic applicants (seniors plus prior graduates) also rose to 86.9%, up from 84.6% in 2019. Positive results by specialty include the 3 most popular specialty matches for DO senior applicants: Internal Medicine, Family Medicine, and Emergency Medicine. Orthopedic Surgery and General Surgery results for DO seniors also showed significant increases.

NBOME President and CEO, John R. Gimpel, DO, MEd, shared his congratulations with students, COMs, AACOM, the AOA and many others who contributed to the success of this year’s single Match: “This year’s Match highlights the opportunity for the betterment of quality care and clinical learning involvement in our nation’s GME programs and the patients and communities we have the privilege to serve.” Dr. Gimpel remarked on the equivalent manner in which US DO and MD seniors and graduates are included in the NRMP Match reports, infographics and other communications. For more on the Match 2020 results, follow NBOME social media or read DO student Match stories here. Additional details about this year’s NRMP outcomes can be accessed in the NRMP 2020 Advance Data Tables here.

With a single GME system come expanded training opportunities for DO and MD applicants, and many are anxiously awaiting the 2020 NRMP® Main Match, which is fast approaching. DO applicants have done well in the Match, and match rates for DO applicants have continued to increase throughout the transition to single GME. Still, there are many myths and misconceptions that surround Match 2020, single GME, and the use of COMLEX-USA scores for DO applicants. Here are a few of them, along with the facts:

ACGME residency programs don’t accept COMLEX-USA.
The ACGME does not require one licensing exam over another—passing either COMLEX-USA or USMLE meets that requirement. Eligibility for appointment into residency programs accredited by the ACGME requires applicants to be graduates of AOA or LCME-accredited medical schools. Bottom line: programs can accept both USMLE and COMLEX-USA scores. Single GME does not exclude DO applicants with COMLEX-USA scores from applying.

A program I want to apply to doesn’t accept COMLEX-USA.
Most residency programs accept COMLEX-USA for application to their programs. In fact, in specialties preferred by DO applicants, 82% of program directors surveyed by the NRMP say they use COMLEX-USA Level 1 scores to consider applicants for interviews. In some specialties, it’s even higher. Historically, match rates for DO students have been high; in fact, the overall match rate for DOs has been close to 99%. Results for DO applicants in the NRMP Main Match have continually increased over time.

Hear what Ken Simons, MD, Senior Associate Dean for GME and Accreditation, the Designated Institutional Official at the Medical College of Wisconsin has to say about this issue:

I heard about a program that accepts COMLEX-USA, but the scores are hard to understand so they don’t use them to select applicants to interview.
Residency programs are great at a lot of things, but some programs may not be as familiar with COMLEX-USA scores or are simply misinformed. COMLEX-USA score reports provide valuable information on performance and are easy to understand. Program directors are provided convenient access to a COMLEX-USA percentile score converter through the ERAS platform and the NBOME website. NBOME provides resources and research to program directors on the predictive validity of the exam and how it can be used as part of a holistic review of candidates.

I heard if you do an ACGME residency, you have to take USMLE to get a license.
Not true. COMLEX-USA is accepted in all U.S. states (and in some international jurisdictions) for licensure for DOs. The Federation of State Medical Boards, the FSMB, accepts COMLEX-USA as valid. According to the FSMB, there is no requirement for DOs to take USMLE in order to obtain a license, in any state.

DOs need to take USMLE in addition to COMLEX-USA because it’s a requirement for certification by ABMS boards.
DOs and MDs completing ACGME-accredited residency programs can choose to become board certified by the AOA and/or the ABMS. DOs who train in ACGME-accredited residency programs and who hold a current and unrestricted medical license are eligible to sit for AOA and/or ABMS certifying boards. The American Osteopathic Association has great information on board certification through AOA boards; find out more here.


It’s Match week and everybody is invited.  Share your congratulatory messages, photos and videos here!

Real-time information about test center status.

Information about COVID-19 for travelers and travel related industries.

General information and situation updates from the Centers for Disease Control and Prevention.

Great resources for Match Week, SOAP applicants, and others.

If you don’t match, your chances of landing a residency position are still high.


We all strive to make #WorkLifeBalance part of our daily practice, but is it? With four years of undergrad, four years of med school, and a whole lot more years of residency, hours spent studying are what fill all the cracks between classes, clinical rotations, and the like. Studying becomes your personal mantra—your bread-and-butter—your life-force. You may feel duty-bound, mired by the responsibilities and expectations tied to your goal to become a DO. And as these responsibilities ramp up, there is the very real potential that the time and energy you once had for your passions and hobbies may be at an all-time low.

The brain-sweat, tears, and hard work that are vital on your journey to DO licensure do not need to evoke feelings of guilt for not studying every hour of the day. So let’s tackle the study burnout and help you find ways to carve out time to keep doing what you love. Here’s how:

Schedule it

Perhaps most of your life is meticulously mapped on a calendar, moment by moment—classes, clinical rotations, crazy amounts of studying, eating, sleeping, laundry, brushing your teeth, more studying. But your hobbies and passions are just as important to your well-being and your mental headspace, and they deserve dedicated time on your calendar too. Yes, yes, we know, there are only so many hours in a day, but who says your passions need to take hours? Take a 10-minute run (instead of your normal 10-mile jog), cook a 15-minute meal (instead of a multi-course masterpiece), read one chapter of a fiction book (instead of the whole thing), doodle for 10 minutes…you get the idea. Your hobbies and passions, just in smaller doses. Clear that brilliant brain of yours, nourish yourself with things that bring you joy, and then get back to the books!

Stop canceling

You’ve made the appointment, but are you going to show up? This is the tough part because we all know how easy it is to cancel, especially when you are too tired, too busy, or feeling too guilty (when you think you should be studying even more than you are). Showing up for yourself amidst COMLEX-USA exam prep might seem like a joke, but doing what you love is a big part of recharging and fueling what comes next. Be confident in the schedule you create and the choices you make, especially when they involve taking care of you. If you don’t do it, no one else is going to do it for you.

Mix it up

It’s easy to shelve the hobbies you’ve had for a long time. While they still bring you joy, they don’t give you the same butterflies-in-your-stomach feeling they did years ago.  Maybe it’s time to try something new: rock climbing, origami, ballroom dancing. Step out of your comfort zone and make use of an entirely different part of your brain or body. Getting out of your study-cave and into something new might also help ignite some of those old passions, clear your mind, and help energize you so you can get back to the studying. Plus, adding to your arsenal of interests may eventually help you connect and empathize better with your patients as well.

When studying starts to bleed into the things you love, take a step back and remind yourself that you’re a person before you are a physician—and even more important: you’re a person even after you become one. During this month of passion and #DOLove, encourage yourself, empower yourself, and break through your schedule to keep doing the things that bring you joy and keep you, you.


COMLEX-USA Item Reduction

The NBOME Board of Directors approved plans to shorten COMLEX-USA Level 2-CE from 400 to 352 test items, beginning June 2020. This will only affect pre-test content, so exam validity/content coverage or reliability will not be impacted. This change should also reduce some of the stress associated with the time-pressured environment. Available test time will remain at 8 hours. Similar changes have been approved for COMLEX-USA Level 1 beginning May 2021.

Pretest questions are embedded into COMLEX-USA exams but do not count towards candidates’ scores. They provide useful information on the quality and relative difficulty of the questions to ensure fairness for candidates. They help obtain item statistics (for quality control as well as equating purposes) and allow novel item formats to be tested further. The total number of items per block will decrease from 50 to 44.
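
The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `summarize_item_reduction` is hypothetical, and the number of blocks is an assumption inferred from 400 items at 50 per block, since the announcement does not state it. The per-item time is an upper bound that assumes all 8 hours are spent answering items.

```python
# Figures quoted in the announcement; nothing here comes from NBOME code or data.
OLD_TOTAL_ITEMS = 400
NEW_TOTAL_ITEMS = 352
OLD_ITEMS_PER_BLOCK = 50
NEW_ITEMS_PER_BLOCK = 44
TEST_TIME_HOURS = 8


def summarize_item_reduction():
    """Cross-check the quoted totals against the quoted per-block counts."""
    # Assumption: the number of blocks is inferred from 400 / 50 and stays
    # the same after the change; the announcement does not state it.
    blocks = OLD_TOTAL_ITEMS // OLD_ITEMS_PER_BLOCK
    items_removed = OLD_TOTAL_ITEMS - NEW_TOTAL_ITEMS
    total_seconds = TEST_TIME_HOURS * 3600
    return {
        "blocks": blocks,
        "new_total_matches_blocks": blocks * NEW_ITEMS_PER_BLOCK == NEW_TOTAL_ITEMS,
        "items_removed": items_removed,
        "percent_reduction": 100 * items_removed / OLD_TOTAL_ITEMS,
        # Upper bound only: assumes all 8 hours go to answering items.
        "old_seconds_per_item": total_seconds / OLD_TOTAL_ITEMS,
        "new_seconds_per_item": total_seconds / NEW_TOTAL_ITEMS,
    }


if __name__ == "__main__":
    for name, value in summarize_item_reduction().items():
        print(f"{name}: {value}")
```

Under these assumptions the quoted numbers are consistent with one another: 8 blocks of 44 items give 352 items, which is 48 fewer than before, a 12% reduction.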



Further Review of COMLEX-USA Exam Scoring and Score-Reporting

NBOME has continued to study the uses of COMLEX-USA scores and score-reporting as they relate to the primary and intended purpose of the examinations (i.e., licensure), as well as secondary uses (the most cited being residency program applications). Our Board of Directors has approved the continued use of pass-fail only for the COMLEX-USA Level 2-PE clinical skills exam, and recommended further study related to the use of pass-fail as well as numerical scoring for COMLEX-USA Level 1, Level 2-CE and Level 3. Further updates will be provided as early as July 2020.



Modification to COMLEX-USA Level 1 and Level 2-PE Test Cycles in 2020

In response to feedback from candidates and deans, NBOME has modified the 2020-2021 test cycle for COMLEX-USA Level 1. It now commences 3 weeks earlier than in prior years, running from May 5, 2020 through April 2021.

We have also adjusted the COMLEX-USA Level 2-PE test cycle in response to increased demand during the spring months.  The 2020 test cycle will now end in early November 2020, with a new complete test cycle beginning November 30, 2020. This change should provide additional testing opportunities in times of higher demand, thus helping candidates and schools to better facilitate the residency program application process.



Prometric Test Center and COMLEX-USA Enhancements for New Test Cycles in 2020

To assure an optimal computer-based testing experience at Prometric Test Centers, modifications continue to be made for COMLEX-USA examinations. Effective with the new May 2020 test cycle for COMLEX-USA Level 1, and the new June 2020 test cycle for Level 2-CE, the Prometric test driver used for all COMLEX-USA examinations has been updated to that already being used in COMLEX-USA Level 3. We endeavor to provide an optimal testing experience for all COMLEX-USA candidates and feel confident that these changes will further enhance the COMLEX-USA program.



NBOME to Modify Attempt Limits for COMLEX-USA Effective July 1, 2022

The NBOME Board of Directors approved changes to eligibility for COMLEX-USA to limit the maximum number of attempts to 4 total per exam, effective July 1, 2022. This change is intended to minimize misclassification, enhance test security/integrity, and reinforce NBOME’s mission to protect the public. Exceptions petitioned by a state medical or osteopathic medical licensing board will be evaluated on a case-by-case basis. Further information will be outlined in the COMLEX-USA Bulletin of Information, planned for release in July 2020.



For more information, please contact NBOME Client Services at clientservices@nbome.org or 866.479.6828



The NBOME is pleased to recognize the 2019 Item Writer and Case Author of the Year award winners from its distinguished National Faculty. Throughout the year, this group of individuals graciously volunteered their time and expertise to contribute to the COMLEX-USA and COMAT exam programs. In addition to their professional roles, these volunteers wear a variety of hats – writing and reviewing test items, serving as physician examiners for COMLEX-USA Level 2-PE, and supporting the NBOME mission to protect the public through competency assessment.

Each year, the NBOME Board selects the best-in-class item writers and case authors from a large group of contributors. Congratulations to these esteemed awardees for their exemplary commitment to producing valid and high-quality exam content.


2019 COMLEX-USA Level 1 Item Writer of the Year: Martin Schmidt, PhD

Dr. Schmidt is a professor of biochemistry at DMU-COM in Des Moines, Iowa and a long-standing member of our National Faculty. His contributions have been to the COMLEX-USA Level 1 and COMAT Foundational Biomedical Examinations.





2019 COMLEX-USA Level 2-CE Item Writer of the Year: John Dougherty, DO

Dr. Dougherty is the Founding Dean and Chief Academic Officer at Noorda College of Osteopathic Medicine (proposed) in Provo, Utah. He has been a member of the National Faculty since 2016.





2019 COMLEX-USA Level 2-PE Case Author of the Year: Robyn Dreibelbis, DO

Dr. Dreibelbis is vice-chair and assistant professor of Family Medicine at WesternU/COMP – Northwest in Lebanon, Oregon. Dr. Dreibelbis has been selected for this award as a member of the Case Development Committee.





2019 COMLEX-USA Level 3 Item Writer of the Year: Binh Phung, DO, MHA

Dr. Phung is a clinical assistant professor of Pediatrics at OSU-COM in Tulsa, Oklahoma and a pediatric hospitalist at the Children’s Hospital at St. Francis. Dr. Phung has focused his talents on the COMLEX-USA Level 3 examination in both multiple-choice questions and clinical decision-making content.




2019 Clinical Decision-Making (CDM) Case Writer of the Year: Brett Stecker, DO

Dr. Stecker is an assistant professor at Alpert Medical School at Brown University and physician advisor at Steward Medical Group at Morton Hospital in South Easton, Massachusetts. Dr. Stecker is experienced in working with the COMLEX-USA Levels 1, 2-CE and 3 examinations, and was previously awarded Item Writer of the Year for COMLEX-USA Level 2-CE in both 2016 and 2018.




2019 COMLEX-USA Osteopathic Principles and Practice (OPP) Item Writer of the Year: Lauren Noto Bell, DO

Dr. Noto Bell is an associate professor at PCOM in Philadelphia, Pennsylvania and a long-standing member of the National Faculty. She has been involved with all levels of the COMLEX-USA examinations and the COMAT clinical exam. She was awarded item writer of the year for OPP in 2017.




2019 COMLEX-USA Preventative Medicine/Health Promotion (PMHP) Item Writer of the Year: Todd Coffey, PhD

Dr. Coffey is chair and associate professor in the department of research and biostatistics at ICOM in Meridian, Idaho. He joined the National Faculty in 2018 and contributes to all levels of the COMLEX-USA examinations.





2019 COMAT Clinical Item Writer of the Year: Jessica Rogers, DO

Dr. Rogers is an Obstetrician and Gynecologist at Coyle Institute Female Pelvic Medicine & Reconstructive Surgery in Pensacola, Florida. She joined the National Faculty in 2014 and has been a significant contributor to both the COMLEX-USA and COMAT examinations.





2019 COMAT Foundational Biomedical Sciences (FBS) Item Writer of the Year: Lori Redmond, PhD

Dr. Redmond is a professor of Neuroscience at PCOM in Suwanee, Georgia and has been a member of our National Faculty since 2017. She was recruited for and has been a strong contributor to the COMAT Foundational Biomedical Sciences examinations.

NBOME recently sat down with Sandra Waters, MEM, Vice President for Collaborative Assessment & Initiatives, to learn more about the upcoming release of CATALYST, a new longitudinal assessment platform that will initially house COMSAE Phase 2 content when it launches this spring.




NBOME: Your team is debuting a new product this spring—COMSAE Phase 2 on CATALYST.  We’re already familiar with COMSAE, but what exactly is CATALYST?

Sandra Waters: CATALYST is a longitudinal assessment platform designed to enhance learning.  So, it isn’t actually content, it’s a new mechanism to deliver content to users.

NBOME: How is longitudinal assessment different from the more traditional learning approaches we’re used to? 

SW: The notion of assessing someone over time—that really is the key. In a traditional class, individuals learn about a subject for a period of time, and then they learn about another subject, and then another subject. Longitudinal assessment uses something called topics interleaving, which enables an individual to gain exposure to ALL of those different components at one time.  It creates these bursts of learning and knowledge acquisition.

If an individual is performing well in a certain area, they don’t need to be assessed nearly as frequently in that area.  In areas where an individual isn’t performing well, CATALYST can increase the volume and frequency of content related to that trouble spot. The intent is to fine-tune the learning component, make it more targeted, and use that to increase knowledge and skills.

CATALYST was developed to combine learning with assessment. The assessments NBOME normally conducts are taken at a single point in time, whereas the CATALYST platform enables an individual to assess their knowledge and skills over an extended period of time. And it aids learning by providing users with immediate feedback while the material is still front-of-mind. Research has shown us that this is a much more effective way for an individual to learn, as opposed to sitting down, taking a test, and never truly understanding what you got wrong, or why.

NBOME: How did this all come about?  What’s CATALYST’s origin story?

SW: When we first had the idea for CATALYST, we focused our efforts on designing the technology for board re-certification.  The current approach involves a physician coming to a testing center every 6-10 years to take a closed book exam for 6-8 hours—not exactly the easiest feat when you’re running a full-time practice, seeing patients, and fulfilling all of the other responsibilities of a busy physician. CATALYST has the ability to change the whole playing field.

However, as we were developing and testing the platform, we continued to identify other ways to use it and other content we could put on there, including our own COMSAE content.

NBOME: Tell me more about why you decided to launch using COMSAE content.

SW: When we were developing CATALYST, we decided to pilot COMSAE Phase 2 on the platform.  It just presented itself as an easy entry point for developing the added features that make CATALYST so special.

Each question includes a rationale essentially explaining why the correct answer is correct and why the incorrect answer is not correct. CATALYST also provides references for further learning and understanding. It’s a self-contained way to test knowledge and skills while providing additional information.

Further, COMSAE on CATALYST is built for busy schedules and maximum flexibility. It’s designed to feed questions to users at self-selected intervals. For example, you could opt to receive 10 questions each week or 30 questions all on one day. As we discussed before, everyone learns a little differently, and we all have different needs and schedules. This platform helps speak to that.

NBOME: With such strong focus on mobility and digitally nimble technology these days, what is the roll-out plan for COMSAE on CATALYST?

SW: COMSAE and other products offered on the CATALYST platform will be available on all devices, and also include a mobile app.  Flexibility and convenience were extremely important to us as we developed the product.

NBOME: Who is eligible to purchase COMSAE Phase 2 on CATALYST and how does the system work?  Can you walk me through the user experience?

SW: It is available to anyone who has an account with NBOME. Candidates may purchase the product through NBOME’s secure portal, at which point, they’ll be sent a welcome email along with login credentials to access the CATALYST platform. From here, they can customize the frequency of questions to suit their unique needs and learning goals. Based on those settings, they will begin to receive notifications when questions are available.

NBOME: If I was a student considering COMSAE on CATALYST, why would I want this over the traditional COMSAE format? 

SW: I actually think you’d want both. The traditional COMSAE allows you to get game-day-ready in an environment that closely mimics COMLEX-USA. Questions are formatted to match in style, and it’s a timed administration, just like COMLEX-USA. You also receive a final report once you complete the assessment. COMSAE on CATALYST is much more of a learning tool. It focuses on mastery of the content. You receive question-by-question and performance-by-domain feedback, rather than just a final report.

Because they serve completely different purposes, I wouldn’t necessarily see one replacing the other.

NBOME: What future enhancements can we look forward to with the CATALYST platform and longitudinal learning?

SW: We’re working on plans to expand our content offerings on the platform. COMSAE Phase 1 is being considered as an option, as well as COMAT subjects. Stay tuned!

Osteopathic International Alliance (OIA) Conference

From October 4-6, 2019, NBOME Board Vice-Chair Geraldine O’Shea, DO, and President and CEO, Dr. John R. Gimpel, attended the annual Osteopathic International Alliance Conference in Bad Nauheim, Germany.

The OIA is in official relations status with the World Health Organization, and “envisions a world in which every person has access to high-quality osteopathic medicine.” Next year’s AGM and conference will be held in Rio de Janeiro, Brazil from September 30-October 2, 2020.

National Resident Matching Program® (NRMP®) Conference

From October 3-5, 2019, NBOME Associate Vice President for Strategy and Quality Initiatives, Melissa Turner, MS, attended the National Resident Matching Program® (NRMP®) Conference in Chicago. The NBOME provided attendees with a COMLEX-USA update.

This year’s stakeholder conference, titled “Transition to Residency: Conversations Across the Medical Education Continuum,” set a record with 300 registrants. The conference focused on a variety of topics related to residency, and speakers included Ezekiel Emanuel, MD, PhD, Helen Fisher, PhD and Lawrence G. Smith, MD, MACP.

The COMLEX-USA Composite Examination Committee (CCEC) Meeting

The COMLEX-USA Composite Examination Committee (CCEC) met on October 14 and 15 in the Philadelphia Executive offices. This committee reviews all levels of the COMLEX-USA examination series, including statistics and candidate feedback, and provides a report to the NBOME Board of Directors. At this meeting, CCEC reviewed performance and innovations happening within the examination series — including the potential of reducing test items in Level 1 and 2-CE, as well as the Level 2-PE team researching possible modifications to the Humanistic and Biomedical/Biomechanical Domains. CCEC convenes the Blueprint Subcommittee, which regularly reviews the COMLEX-USA Master Blueprint to assure it keeps up with the evolving practice of osteopathic medicine.

The committee also discussed hot topics related to licensure examinations, such as the possibility of switching to a pass/fail scoring system or keeping some form of numeric-based scoring. The CCEC is also reviewing the current limits on the maximum number of attempts per examination level. The Point-of-Care Knowledge, Education and Testing (POCKET) process was reviewed, and next steps for the process were recommended.

Osteopathic Medical Education Conference (OMED)

From October 25-28, 2019, the NBOME participated in the Osteopathic Medical Education Conference (OMED) in Baltimore, Maryland. The American Osteopathic Association’s (AOA) annual conference brought together thousands of osteopathic physicians, medical students, and other health professionals from across the country for medical education, inspiration, networking and entertainment.

NBOME exhibited at the conference, featuring our new COMAT-Foundational Biomedical Sciences portfolio, as well as the COMLEX-USA examination series, the CATALYST platform and opportunities for doctors to explore the NBOME National Faculty Program. Staff at the NBOME booth had meaningful conversations with many visitors and were on hand to answer questions from students, faculty, practicing physicians and others.

On day one of the conference, the American Osteopathic Foundation hosted its annual Honors Gala, presenting awards to a number of NBOME National Faculty members, including AOF Educator of the Year to Richard Jermyn, DO, from RowanSOM. In addition, and in honor of NBOME’s 85th anniversary year, the NBOME made a contribution to the William Anderson, DO Minority Scholarship Fund.

Early Sunday morning, a team of NBOME runners and walkers came out to join in the Advocates for the American Osteopathic Association (AAOA) Fit for Life Run 2019. The run benefitted osteopathic student scholarships and the NBOME was a featured sponsor as well.

Association of American Medical Colleges (AAMC) Annual Conference

From October 8-12, NBOME sent John R. Gimpel, DO, MEd to attend the AAMC Learn, Serve, Lead 2019 Conference in Phoenix, AZ. The meeting covered many of the successes and challenges of academic medicine nationwide, and was attended by medical educators from across the country. Retired President and CEO of AACOM, Steve Shannon, DO, received an achievement award.

Council of Medical Specialty Societies (CMSS) & Organization of Program Directors Associations (OPDA) Annual Meeting

From November 21-22, 2019, NBOME Vice President for Collaborative Assessment & Initiatives Sandra Waters, MEM and Associate Vice President for Strategic & Quality Initiatives, Melissa Turner, MS, attended the Council of Medical Specialty Societies (CMSS) Annual Meeting and Specialty Forum in Arlington, VA.

Together, they presented NBOME and COMLEX-USA updates at The Organization of Program Directors Associations (OPDA) meeting. “OPDA is dedicated to promoting the role of the residency program director and program director societies in achieving excellence in graduate medical education.”

The NBOME Test Accommodations Committee (TAC) Meeting

On November 21-22, the NBOME Test Accommodations Committee convened. The committee is composed of osteopathic physicians and other subject matter experts who review applications for special accommodations from COMLEX-USA candidates in cooperation with NBOME staff.

The Committee met to discuss trends and developments in the test accommodations realm, as they apply to high-stakes testing agencies like the NBOME.


Coming Up

In the next quarter, we’ll be making appearances at the following conferences and meetings:

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides competency assessments for osteopathic medical licensure and related health care professions, announced Lori Kemper, D.O., M.S., FACOFP, as its newest Secretary-Treasurer. At its December Board of Directors meeting, the NBOME elected Dr. Kemper to a two-year term.

“I’m thrilled to have even been considered, let alone chosen, as the NBOME’s next Secretary-Treasurer,” said NBOME board member, Lori Kemper, D.O., M.S., FACOFP. “I look forward to working towards our mission in this new, exciting officer role.”

Dr. Lori Kemper’s more than 30-year career encompasses both independent practice and graduate medical education. Since 2007, she has been the dean of Midwestern University, Arizona College of Osteopathic Medicine, where she previously served as associate dean of graduate medical education and associate professor in the department of family medicine. She currently serves as a commissioner to the American Osteopathic Association (AOA) Commission on Osteopathic College Accreditation (COCA) and is the chair of the Board of Deans of the American Association of Colleges of Osteopathic Medicine (AACOM). Dr. Kemper currently serves the NBOME as a member of the Test Accommodations Committee and the Awards Committee.

“We’re ecstatic to announce Dr. Kemper’s election to Secretary-Treasurer of the NBOME,” said NBOME Board Chair Geraldine O’Shea, DO. “Her tenure on our Board of Directors has resulted in great strides for us as an organization, and we look forward to what she’ll help us accomplish moving forward.”

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides competency assessments for osteopathic medical licensure and related health care professions, today introduced Juan F. Acosta, DO, MS, as its newest board member. He was recommended to the NBOME Nominating Committee via the Assembly of Osteopathic Graduate Medical Educators (AOGME, formerly known as the Association of Osteopathic Directors and Medical Educators, AODME), filling the seat previously held by new NBOME Vice-Chair Richard J. LaBaere II, DO, MPH. He was elected at the annual NBOME Board Meeting in December.

“Joining the Board of Directors at the NBOME is an exciting opportunity for me,” said new NBOME board member Juan F. Acosta, DO, MS. “I’d like to express my gratitude to Dr. Gimpel and his colleagues at the NBOME Board for electing me to this position.”

Dr. Acosta recently moved to New York, where he serves as the Associate Medical Director for the Emergency Department at Saint Catherine of Siena Medical Center in Smithtown. He is also actively involved with the Disaster Medical Assistance Team (DMAT) and serves as a reviewer for the Journal of Emergency Medicine and a section editor for the West-JEM Journal. Dr. Acosta is an Oral Board examiner for the American Osteopathic Board of Emergency Medicine (AOBEM). He is presently the secretary for the American College of Osteopathic Emergency Physicians (ACOEP) and secretary for the Association of Osteopathic Directors and Medical Educators (AODME). Dr. Acosta also serves on the American Osteopathic Association’s Commission on Osteopathic College Accreditation (COCA) and the Committee on Continuing Medical Education (CCME).

“We are very fortunate to welcome Dr. Acosta to the Board,” said NBOME Board Chair Geri O’Shea, DO. “His enthusiasm, decorated professional career and experience in graduate medical education add to our Board at an exciting time for the NBOME and the osteopathic medical profession.”

COMAT Product Updates

The COMAT exam series will expand in January 2020 to include the new Foundational Biomedical Sciences (FBS) Targeted exams. Each of the 14 exams focuses on a specific body system or basic science discipline introduced to osteopathic medical students in years one and two.

Since the introduction of the FBS Comprehensive exam in December 2018, a total of 20 Colleges of Osteopathic Medicine have used or plan to use the FBS exams for pre-clerkship assessments to complement their use of the COMAT Clinical discipline exams for their end-of-rotation needs in years three and four.

The October issue of the Journal of Graduate Medical Education included research on the concurrent and predictive validity of the COMAT discipline exams and COMLEX-USA Level 2-CE. The findings indicated statistically significant, positive associations between COMAT and COMLEX-USA Level 2-CE scores, which can support the use of COMAT for osteopathic medical schools.


COMSAE Product Updates

After the first several score releases for COMLEX-USA Level 1 beginning mid-July 2019 and Level 2-CE beginning mid-August 2019, the NBOME conducted a comprehensive evaluation of scores on COMSAE Phase 1 and Phase 2 and subsequent scores on COMLEX-USA Level 1 and Level 2-CE. Following this evaluation, new score reports and scoring for both COMSAE Phase 1 and Phase 2 will be implemented on February 3, 2020. Please note that the COMSAE Phase 2 cut-score remains the same as in 2019.

In addition, the NBOME has conducted a concordance correlation study, which demonstrated a positive and significant correlation, around 0.70, between COMSAE Phase 1 and COMLEX-USA Level 1 and between COMSAE Phase 2 and COMLEX-USA Level 2-CE. This finding is consistent with those of recent years for all forms purchased by COMs with timed administrations.

As always, caution should be exercised when using COMSAE scores to estimate subsequent COMLEX-USA scores or for uses other than those for which they were developed.

We will continue to communicate regularly with COMs regarding new information related to COMSAE scores and their relationship to COMLEX-USA scores, as well as other COMSAE program updates as they become available.

“When you show deep empathy toward others, their defensive energy goes down, and positive energy replaces it. That’s when you can get more creative in solving problems.” – Stephen Covey

Empathy has always been at the root of human connection, and from it stems our capacity to help others. Whether the person is family, a friend, or a patient, it all comes down to the same core values. And yet the humanistic domain is the one whose inclusion in the DO licensure exam many question. How important is it?

Having worked in osteopathic medical education and licensing for over 25 years, I am frequently asked what makes DOs different. My answer is always the same: it’s about patient empathy. This isn’t to say that MDs don’t possess this trait; they do. However, there’s a heightened sense of empathy and patient understanding that seems to steer certain candidates toward osteopathic medicine.

The DO approach is based on the unique connection between mind, body, and spirit as it relates to patient care. It’s this holistic, 360-degree assessment and desire for enhanced understanding that fuels empathy and a different shade of patient care. It also involves empathic inquiry: developing an understanding not just of the problem at hand but also of the other life factors affecting the patient. As a doctor, understanding how these many dimensions interact and intersect on a deeper level is the basis of the DO approach.

To clarify, empathic doctors are not internalizing or ‘taking on’ a patient’s pain or discomfort in a therapeutic way. Rather, they’re attempting to understand the patient’s illness experience. A patient once told me, “I don’t need my doctor to love me, but I do need them to understand me.” That deeper level of understanding is what brings humanistic values back into the medical encounter, allowing for the establishment of a knowing relationship. Research has shown that empathy helps to build trust, is linked to better diagnoses, improves patient outcomes, and decreases malpractice lawsuits.

Now that we have a better understanding of the importance of empathy, how do we measure it in a clinical setting? The COMLEX-USA Level 2-Performance Evaluation has been assessing the interpersonal and communication skills and professionalism of candidates for the past 15 years, and empathy is one of six dimensions assessed. Based on evidence that patients place great value on their human connection to their doctor, there are several guiding principles that support the role empathy plays in patient care:

Empathy is connection.

Attend to the patient both verbally and non-verbally. Listen to them. Make eye contact. Actively respond to their condition or pain. Avoid giving patients the ‘clinical cold shoulder’ by focusing only on their symptoms.

Empathy is curiosity.

This is especially important for young doctors who don’t have a lot of patient experience. Learn from the patient. Explore their illness experience. Discuss their lifestyle, their belief system, their stress levels, what motivates them. Give patients the feeling of being understood.

Empathy is compassion.

This is the ability to imagine what a patient is experiencing without being overwhelmed by their pain or distress. Research has shown that people are selective when expressing empathy towards others. It’s hard to feel compassion, for example, when a patient is difficult, unlikeable, or struggling with unhealthy behaviors that put them at risk. But it’s these patients who are most deserving and in need of our compassion and understanding.

Empathy is not stress.

Stress is in opposition to empathy. It’s difficult to connect to a patient, or anyone for that matter, when we feel anxious and overwhelmed. Likewise, physicians who have difficulty managing their feelings towards patients are themselves at risk. Although the stress of working with patients is an unavoidable part of a physician’s work life, one goal of medical education should be to equip students with the skills to manage stress in healthy ways.

A generation ago, few were talking about the role of empathy in healthcare. But today, cognitive neuroscience has enabled us to look critically at what ignites and motivates our behaviors, including the empathic ones. This new knowledge and learning, coupled with a heightened focus on developing higher quality patient care, shines a bright light on the need and desire for greater empathic engagement. That said, empathy is not what makes good doctors; it’s what makes good doctors even better.

Demonstrating empathy is important to becoming a DO, and since 2004, passing this assessment has been required to obtain the DO degree, move into residency training, and obtain a license to practice osteopathic medicine.


Contributed by Tony Errichetti, PhD  |  Director of Doctor-Patient Assessment  |  NBOME

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides competency assessments for osteopathic medical licensure and related health care professions, announced the installation of three officers to its board of directors.

In addition to the new officers, the NBOME recognized Dana C. Shaffer, DO, FACOFP, for his service as Board Chair over the past two years. NBOME President & CEO, John R. Gimpel, DO, MEd, expressed his gratitude for Immediate Past Chair Dr. Shaffer.

The following individuals were elected to serve as officers for the NBOME’s Board of Directors:

Board Chair: Geraldine T. O’Shea, DO 

As the Chair of the NBOME Board, Dr. O’Shea will lead the NBOME’s strategic plan for 2020-2022, ACEL and vision to become the global leader in assessment for the osteopathic medicine and related health care professions. Dr. O’Shea became a member of the NBOME Board in December 2009 and has served on the Awards Committee, COMLEX-USA Composite Examination Committee, Finance Committee, and the Marketing and Communications Task Force. She was installed as Vice-Chair in December 2017. Previously she served as Secretary-Treasurer from 2015-2017, chaired the Finance Committee and currently serves as a member of the Executive Committee, the Compensation Subcommittee, the SAS for GME Outreach Task Force, and as Liaison Committee Chair.

Dr. O’Shea has practiced internal medicine at the Foothills Women’s Medical Center in Jackson, California, since 1998. A 1993 graduate of the Western University of Health Sciences College of Osteopathic Medicine of the Pacific, she completed her internal medicine residency at the Maricopa Medical Center in Phoenix, Arizona. Dr. O’Shea served as president of the Osteopathic Medical Board of California from 2006 to 2012, and in 2013 as president of the American Association of Osteopathic Examiners. She also previously served on the Federation of State Medical Boards (FSMB) Nominating Committee and has served on the FSMB’s Awards, Audit, and Finance Committees.

Dr. O’Shea is a trustee of the American Osteopathic Association (AOA) and serves as chair of the Strategic Planning Committee, the Bureau of Membership and the Membership Value Task Force. Before being appointed to the AOA Board of Trustees, Dr. O’Shea served the AOA in many capacities, including vice-chair of the Bureau on Federal Health Programs and vice-chair of the Council of Women’s Health Issues. As past president of the Osteopathic Physicians and Surgeons of California (OPSC), Dr. O’Shea was chair of the California delegation to the AOA’s House of Delegates between 2006 and 2014 and received the OPSC’s Lifetime Achievement Award in February 2012.

Board Vice-Chair: Richard J. LaBaere, II, DO, MPH, FAODME

Dr. LaBaere is the NBOME’s newly installed Vice-Chair for 2020-2022. He joined the NBOME Board in 2010 and served on the organization’s Blue Ribbon Panel on Enhancing COMLEX-USA and the Marketing and Communications Task Force. Dr. LaBaere previously served as Secretary-Treasurer on the Board of Directors, and he chairs the Finance Committee and the COMLEX-USA Composite Examination Committee. He also serves on the Compensation Committee and the Executive Committee, as well as the SAS for GME Outreach Task Force.

Dr. LaBaere is currently the associate dean for postgraduate training, the osteopathic postdoctoral training institution academic officer and an adjunct clinical professor of family medicine at A.T. Still University–Kirksville College of Osteopathic Medicine (ATSU-KCOM) in Missouri. He has served as regional assistant dean for the Michigan region at the Genesys Regional Medical Center in Grand Blanc, Michigan, where he began his career in 1993 in private practice and graduate medical education.

He has served in various roles as family medicine residency program director, director of medical education and designated institutional official for over 25 years. Dr. LaBaere has presented to local, state and national audiences and has received a number of awards, including the 2006 Osteopathic Family Physician of the Year by the Michigan Association of Osteopathic Family Physicians, and was inducted into the American Osteopathic Association’s Mentor Hall of Fame in 2007 and as a fellow in the collegium of the Association of Osteopathic Directors and Medical Educators (AODME) in 2008. He served as AODME president in 2013. Dr. LaBaere is certified by the American Board of Osteopathic Family Physicians. He earned his Bachelor of Science and master of public health degrees from the University of Michigan in Ann Arbor and his DO degree from the Michigan State University College of Osteopathic Medicine.

Board Secretary-Treasurer: Lori A. Kemper, DO, MS, FACOFP

Dr. Kemper will serve as the NBOME Secretary-Treasurer for 2020-2022. A member of the NBOME Board, Dr. Kemper is also a member of the Test Accommodations Committee and the Awards Committee.

Dr. Kemper’s more than 30-year career encompasses both independent practice and graduate medical education. Since 2007, she has been the dean of Midwestern University, Arizona College of Osteopathic Medicine, where she previously served as associate dean of graduate medical education and associate professor in the department of family medicine. She currently serves as a commissioner to the American Osteopathic Association (AOA) Commission on Osteopathic College Accreditation (COCA) and is the chair of the Board of Deans of the American Association of Colleges of Osteopathic Medicine (AACOM).

Dr. Kemper earned her DO degree from the Kirksville College of Osteopathic Medicine in 1981 and a master’s degree in biological sciences from Arizona State University. She is board certified in family practice and is a fellow of the American College of Osteopathic Family Physicians. Dr. Kemper has practiced as a family physician since 1982, starting her career with the National Health Service Corps, where she provided care for the underserved population in south Phoenix, Arizona. She served as director of medical education and as the family medicine residency program director for Tempe St. Luke’s Hospital in Tempe, Arizona, from 1993 to 2007, where she also served as chief of staff from 2005 to 2007.

Dr. Kemper has earned numerous awards, including the Arizona Osteopathic Medical Association (AOMA)’s Excellence in Osteopathic Medical Education award (2010), Phoenix Magazine’s “Top Doc” award (2007, 1997), and the AOMA’s Physician of the Year Award (2006). Dr. Kemper served as the program director for OMED 2011, the annual Osteopathic Medical Conference and Exhibition. She chairs the Professional Education Committee for the Arizona Osteopathic Medical Association, of which she is past president.

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners’ (NBOME) Board of Directors appointed 12 new leaders to its National Faculty chair positions.

The NBOME’s National Faculty is made up of over 700 active, engaged members from across the nation. These thought leaders have diverse expertise in all osteopathic health professions and specialties, osteopathic medical education and evaluation, and osteopathic physician licensure and regulation. Together, they serve on operational committees that review exam criteria, write and review exam items, and serve other roles in our mission to protect the public through rigorous competency assessment of osteopathic medical practitioners.

On behalf of the NBOME Board of Directors and staff, we would like to congratulate and welcome the following National Faculty members who have been appointed to 2020 National Faculty Chair positions.

Foundational Biomedical Sciences Division Chair, Pharmacology

Adrienne Z. Ables, PharmD, MS, FNAOME – Virginia College of Osteopathic Medicine Carolinas Campus


COMAT Examination Chair, Emergency Medicine

Thomas E. Benzoni, DO, EM, AOBEM, FACEP – Des Moines University College of Osteopathic Medicine


Clinical Decision-Making and Key Features Chair

Peter F. Bidey, DO, MSEd – Philadelphia College of Osteopathic Medicine


COMLEX-USA Level 1 Examination Chair

Joyce A. Brown, DO, CHSE – Touro College of Osteopathic Medicine – Middletown


Clinical Sciences Department Chair, Radiology and Diagnostic Imaging  

Samuel M. Cosmello, DO, RPh – Private Practice, Fayetteville, NC


Clinical Sciences Department Chair, Surgery, Surgical Specialties and Anesthesia

Jay M. Crutchfield, MD, FACS – A.T. Still University School of Osteopathic Medicine in Arizona


Foundational Biomedical Sciences Division Chair, Biochemistry

Martha A. Faner, PhD – Michigan State University-College of Osteopathic Medicine


Clinical Sciences Department Chair, Preventive Medicine and Health Promotion

Joyce M. Johnson, DO, MA – Georgetown University


Foundational Biomedical Sciences Division Chair, Physiology

Kathleen P. O’Hagan, PhD – Midwestern University Chicago College of Osteopathic Medicine


COMAT Examination Chair, Surgery

Michelle M. Sowden, DO – University of Vermont College of Medicine


Foundational Biomedical Sciences Department Chair

Robert J. Theobald, PhD – A.T. Still University-Kirksville College of Osteopathic Medicine


Clinical Sciences Preventive Medicine and Health Promotion Division Chair, Biostatistics and Epidemiology

Eduardo Velasco, MD, MSc, PhD – Touro University College of Osteopathic Medicine – California


“Our National Faculty is crucial to our mission of protecting the public,” said Sandra Waters, MEM, NBOME’s Vice President for Collaborative Assessment & Initiatives. “The NBOME is honored to have such talented and committed thought leaders that represent all aspects of clinical and foundational biomedical science disciplines.”

About the NBOME

The National Board of Osteopathic Medical Examiners (NBOME) is an independent, not-for-profit organization that provides competency assessments for osteopathic medical licensure and related health care professions. NBOME’s COMLEX-USA examination series is a requirement for graduation from colleges of osteopathic medicine and provides the pathway to licensure for osteopathic physicians in the United States and numerous international jurisdictions.

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides competency assessments for osteopathic medical licensure and related health care professions, salutes Dana C. Shaffer, DO, on the culmination of his two-year term as Chair of the NBOME Board of Directors. At their Board of Directors Annual Meeting and Gala Dinner in December, the NBOME recognized Dr. Shaffer for his exceptional leadership and service.

“My time with the NBOME and all the wonderful folks here have been instrumental in my career,” said current NBOME Board Chair, Dana Shaffer, DO. “It’s been a privilege to serve as the Board Chair, and I’m confident the future is bright for the NBOME and its leadership.”

Dana C. Shaffer, DO, the dean at the Kentucky College of Osteopathic Medicine (KYCOM) in Pikeville, Kentucky, was installed as Chair of the NBOME Board in December 2017. He serves as a member of the Executive Committee and Compensation Subcommittee. He previously served as Vice-Chair and Secretary-Treasurer of the NBOME Executive Committee, as a member of the Test Accommodations Committee and as chair of the Finance Committee and the Liaison Committee. Prior to serving as dean at KYCOM, Dr. Shaffer served as senior associate dean and as senior associate dean of clinical affairs at Des Moines University College of Osteopathic Medicine from 2006 to 2013. Prior to that, Dr. Shaffer practiced the complete spectrum of family medicine in rural Iowa for 22 years, including osteopathic manipulative medicine, obstetrics and emergency medicine, as well as both inpatient and outpatient care.

“Dr. Shaffer has excelled as a leader for us at the NBOME,” said NBOME President & CEO, John R. Gimpel, DO, MEd. “His wisdom, judgment, experience and commitment have been great assets for us, and we thank him for his countless contributions to the NBOME and our mission.”

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides competency assessments for osteopathic medical licensure and related health care professions, today announced Kim E. LeBlanc, MD, PhD, as the recipient of its 2019 Clark Award for Patient Advocacy. The award was created to recognize those who have gone above and beyond the call of duty in their advocacy for patient safety, patient protection and quality of care. It recognizes those who have worked to assure patients that DOs have qualified for licensure by virtue of having passed the licensure examinations (COMLEX-USA) that are designed for, and have evidence of validity for, the practice of osteopathic medicine. Dr. LeBlanc was presented with the award as part of NBOME’s Annual Board Meeting and Gala Dinner.

“The NBOME has been so important to my professional life, and I couldn’t be more honored to have been chosen as their next Clark Award winner. I appreciate all of my colleagues at the NBOME and in the osteopathic medical profession. Nothing would be possible without all of your hard work,” said Kim E. LeBlanc, MD, PhD. “I have worked with colleagues at the NBOME and on the COMLEX-USA examinations since my tenure at the Louisiana State Board of Medicine and have been proud to endorse the use of COMLEX-USA for osteopathic medical licensure in Louisiana and across the nation.”

Dr. LeBlanc recently transitioned back to Louisiana from his position as Executive Director of the Clinical Skills Evaluation Collaboration (CSEC), which creates and administers the United States Medical Licensing Examination (USMLE) Step 2 Clinical Skills examination. He was instrumental in advocating for equivalent licensure for DOs and acceptance of COMLEX-USA when he served as President of the Louisiana State Board of Medicine and on the Board of Directors of the Federation of State Medical Boards. Dr. LeBlanc was in the private practice of family medicine and sports medicine for nearly 20 years, during which he became involved in academic medicine. He also served as team physician for the University of Louisiana at Lafayette, several US Olympic teams, and several professional baseball, soccer, ice hockey, and football teams.

“Dr. LeBlanc’s role in advocating for DOs and COMLEX-USA is a vital piece of NBOME’s exciting history, so it is fitting that he should be awarded the NBOME Clark Award for Patient Advocacy in this our 85th Anniversary year,” said NBOME President & CEO, John R. Gimpel, DO, MEd. “Dr. LeBlanc has made a major difference in health care and medical licensure.”

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides competency assessments for osteopathic medical licensure and related health care professions, today announced Gary L. Slick, DO, MA, as the recipient of its 2019 Santucci Award. Thomas F. Santucci, Jr., DO, was the NBOME’s President and Chair of the Board from 1985 to 1987, at a pivotal time of change for the organization. The Santucci Award is the NBOME’s highest honor, awarded only to an individual who has distinguished themselves through sustained, outstanding contributions to the mission of the NBOME: protecting the public via competency assessment. Since 1978, Dr. Slick has served in numerous roles at the NBOME, including as Chair of the Board of Directors from 2015-2017.

“I’m truly humbled to have been chosen by my peers to receive The Santucci Award,” said Gary L. Slick, DO, MA. “I want to thank all of my fellow board members at the NBOME for this distinction. Everything the NBOME has accomplished has been a team effort, and I look forward to what’s to come in the future.”

Dr. Slick currently serves as the designated institutional official of the graduate medical education residency and fellowship programs under sponsorship of the Oklahoma State University Center for Health Sciences (OSU-CHS), the chief academic officer of the Osteopathic Medical Education Consortium of Oklahoma, professor of medicine at the OSU-CHS, and member of the board of directors of the Accreditation Council for Graduate Medical Education.

A nephrologist, Dr. Slick has served the NBOME in numerous volunteer capacities over four decades, including as an item writer, test construction committee member, and final exam reviewer in physiology and internal medicine for COMLEX-USA examinations. He has served as chair of numerous NBOME Board and testing committees, including as the inaugural Chair of the COMAT internal medicine examination; COMAT tests are now used in clerkship evaluation at almost every college of osteopathic medicine nationwide. Dr. Slick has been a member of the NBOME Board of Directors since 2005 and was installed as Board Chair in 2015.

“We are so pleased to recognize Dr. Slick with the NBOME’s highest honor,” said NBOME President & CEO, John R. Gimpel, DO, MEd. “Dr. Slick has made immeasurable contributions to the NBOME and the osteopathic medical profession since the 1970s, and how fitting that he should be a 2019 Santucci Award Winner in our 85th Anniversary year.”

As we continue to reflect on our 85th anniversary, we discussed the most memorable achievements in the history of NBOME with our Board of Directors.


What would you identify as NBOME’s greatest accomplishment since its founding?

Richard LaBaere II, DO, MPH: NBOME’s greatest accomplishment lies in the establishment of the COMLEX-USA series and its reputation as a nationally and internationally recognized assessment tool that is valid, reliable and relevant to what osteopathic physicians do. The NBOME has been tireless in implementing best practices in test development and testing, has made research a priority, and has employed a forward-looking approach to improvement and service.

Gary Slick, DO, MA: NBOME’s greatest accomplishment to date is being recognized nationally and at the federal and state level as one of two accepted licensing agencies in the U.S.

John Thornburg, DO, PhD: There has been significant evolution and growth from the original small ‘mom and pop’ organization, with only a few full-time employees, to what it is today. NBOME and COMLEX-USA have had much to overcome over the years and they have done so with tremendous grace.


What are you most proud to have been a part of since becoming involved with the NBOME?

Richard LaBaere II, DO, MPH: I am most proud of our thoughtful and deliberate growth in both capacity and relevance in the assessment and services NBOME provides. The implementation and further development of COMAT, the launch of a new testing blueprint, and the opening of a new clinical skills testing center are just a few great examples of strategic growth which has helped us in fulfilling our mission to protect the public. NBOME has been a reliable, steadfast partner to many affiliated organizations as well, willing and able to help others move forward during turbulent times of change.

William Anderson, DO: One of the most significant accomplishments that I am glad to have been a part of is the high standards that NBOME set for the profession.

John Thornburg, DO, PhD: One of NBOME’s biggest accomplishments has been the recent adaptation of COMLEX-USA to a competency-based blueprint with the highest standards of quality, enhancing our esteemed status as the one-and-only osteopathic medical assessment for licensure.


What is the biggest challenge you have seen the NBOME face and overcome?

William Anderson, DO: The USMLE examination has long been recognized as the licensure exam that allows medical students to practice independently. As a result, NBOME and COMLEX-USA have faced a great deal of competition and challenge while working to establish a unique path for osteopathic medical licensure. The fact that NBOME was able to meet these challenges and emerge successful as an equivalent evaluation, speaks to the high standards of COMLEX-USA, and its appropriateness as a tool to measure and assess osteopathic medical knowledge.

John Thornburg, DO, PhD: Over the years, NBOME has faced many significant challenges and has worked tirelessly to gain respect and acceptance across the medical community and the general public. The quality of our assessment products has been key to our success, as well as NBOME’s efforts to strengthen relationships with our many stakeholders, particularly residency program directors, FSMB, NBME, AOA, AOE, and COM deans.


What is the most dramatic change you have seen during your tenure at the NBOME?

Richard LaBaere II, DO, MPH: In the years since I have become a part of the NBOME, I’ve noticed the incredible growth in the understanding of COMLEX-USA in the past decade alone; more and more know about COMLEX-USA and how it reflects the performance of those training in the osteopathic profession.

Gary Slick, DO, MA: Originally, the NBOME only had one examination: COMLEX-USA. In recent years, however, there has been an explosion in the number of assessments developed—from COMSAE, to COMAT, to CORRE. These new assessments have allowed new knowledge to be assessed and a larger number of stakeholders to take advantage of our examinations, including students, COMs, physicians, etc.

John Thornburg, DO, PhD: When I first became involved with the NBOME, there were ten COMs, some with class sizes of fewer than 100. The subsequent increase in the number of COMs and their class sizes has resulted in a huge increase in revenue, as well as the need for more NBOME staff to meet this demand. While the COMLEX-USA series remains NBOME’s primary product, the role new assessment products have played is far beyond what could have been foreseen 10 years ago.


What comes next for NBOME? What are you most excited about?

Richard LaBaere II, DO, MPH: I am really excited about the development of new technology platforms like CATALYST to sustain and drive easy access and expedite ways to continue life-long learning. I am also excited about how we can use assessment data in novel ways to assist both students and residency program directors in achieving the very best match possible in graduate medical education, especially in light of the single accreditation system for graduate medical education in 2020.

William Anderson, DO: I anticipate NBOME’s next steps will be closely tied to the single accreditation system for graduate medical education and to the ACGME. As the GME landscape changes, the osteopathic medical community will need to adapt alongside it, working to earn a place in the new ACGME system and position itself as an asset to the practice of medicine.



Richard J. LaBaere II, DO, MPH serves as the Secretary-Treasurer on the Board of Directors, chairs the Finance Committee and vice-chairs the COMLEX-USA Composite Examination Committee. Dr. LaBaere is currently the associate dean for postgraduate training, the osteopathic postdoctoral training institution academic officer and an adjunct clinical professor of family medicine at A.T. Still University–Kirksville College of Osteopathic Medicine (ATSU-KCOM) in Missouri. 


Gary L. Slick, DO, MA is Immediate Past Board Chair on the Board of Directors and a member of the NBOME’s Compensation Subcommittee and Nominating Committee. Dr. Slick also currently serves as the designated institutional official of the graduate medical education residency and fellowship programs under sponsorship of the Oklahoma State University Center for Health Sciences (OSU-CHS).


John E. Thornburg, DO, PhD serves as a National Faculty Chair in Foundational Biomedical Sciences at NBOME. Dr. Thornburg also currently serves as Professor Emeritus in the  Pharmacy and Toxicology department at Michigan State University. In 2012, he was awarded the AOA’s Distinguished Service Certificate during AOA OMED.


William G. Anderson, DO, was an active member of the NBOME Board of Directors from 2003 through 2014 and was a member of its Executive Committee from 2007 to 2010. Dr. Anderson is a professor of surgery and senior adviser to the dean at the Michigan State University College of Osteopathic Medicine (MSU-COM).




Horber DT, Waters S. CATALYST: Transforming Physicians’ Assessment into Learning. Presentation delivered at the 2019 Meeting of the American Board of Medical Specialties, Chicago, IL, September 2019.


Session Summary

For years, physicians have criticized maintenance of certification as an ineffective requirement that is irrelevant to practice and cost-prohibitive. In response, several specialty Boards have implemented longitudinal assessment formats to ensure continuing physician competency. The National Board of Osteopathic Medical Examiners (NBOME) has developed CATALYST, an assessment platform supported by findings from cognitive learning that emphasize the value of retrieving previously learned content, providing immediate feedback, spacing questions over time, and interleaving topics in order to produce more complex and durable learning.

During 2017 and 2018, in conjunction with the American Osteopathic Association (AOA), the NBOME conducted 16-week pilot studies with three osteopathic specialty boards. Results provided overwhelming support for the CATALYST assessment platform: of the 196 diplomates surveyed, 95% agreed or strongly agreed that CATALYST would help them stay current in their specialties and over 98% preferred the CATALYST format to traditional Board examinations. A significant pilot finding was that different specialty Boards have different requirements and expectations, as do the physicians within each specialty. In order to provide greater customization within CATALYST, the NBOME is implementing a new CATALYST platform, with a follow-up study.

The presentation will describe CATALYST as an assessment format, summarize NBOME’s development path including the new platform and pilot outcomes, and describe alternative uses for CATALYST. Lessons learned from this journey and planned next steps will provide insights to organizations seeking alternative modes of ongoing physician assessment. Audience participation and questions will be encouraged.


Learning Objectives

By attending this presentation, attendees will be able to:
• Describe CATALYST’s basis in cognitive learning theory
• Summarize the outcomes reported in the CATALYST pilot studies
• Describe next steps for CATALYST

Shaffer D, Waters S.  Ensuring Ongoing Physician Competency with CATALYST.  Presentation delivered at the 2019 Meeting of the International Association of Medical Regulatory Authorities, Chicago, IL, September 2019.



The purpose of maintenance of certification in the United States is to ensure ongoing physician competency in order to safeguard patient safety. In recent years, maintenance of certification, with its generally unpopular traditional, high-stakes, multiple-choice examination, has been criticized as a cost-prohibitive process that is not relevant to physicians’ clinical practice. In response, some specialty Boards, among them the American Board of Anesthesiology, the American Board of Pediatrics, and the American Board of Internal Medicine, have implemented alternative assessment formats that focus on facilitating physicians’ continued learning.

In keeping with its mission, the National Board of Osteopathic Medical Examiners (NBOME) has developed CATALYST, a longitudinal assessment designed to provide specialty Boards with a potential means of assessing ongoing physician competency. CATALYST is based on findings from cognitive learning that emphasize retrieving previously learned content, providing immediate feedback, spacing questions over time, and interleaving topics. The NBOME, in conjunction with the American Osteopathic Association (AOA), conducted 16-week pilot studies to gather data concerning how diplomates from three osteopathic specialty boards viewed the CATALYST assessment platform and the assessment process. Participants were recruited from the American Osteopathic Board of Internal Medicine (AOBIM), the American Osteopathic Board of Pediatrics (AOBP), and the American Osteopathic Board of Obstetrics and Gynecology (AOBOG).

Results indicated overwhelming support for the CATALYST platform: of the 196 diplomates surveyed, 95% agreed or strongly agreed that CATALYST would help them stay current in their specialty and 91% thought it would help them take better care of their patients. Over 98% stated that they would rather answer a fixed number of CATALYST questions periodically than take the traditional recertification examination.

This presentation will describe the use of CATALYST as an assessment format and summarize the outcomes of the pilot studies. Next steps for CATALYST, including the development of a new technology platform, will also be discussed. Lessons learned will assist participants in considering exploration or potential enhancement of similar programs in their jurisdictions.


Behavioral Learning Objectives

By attending this presentation, attendees will be able to:
• Explain the elements of cognitive learning theory that support CATALYST as a longitudinal assessment.
• Describe the outcomes of the pilot studies with diplomates of three osteopathic specialty boards.
• Describe next steps for CATALYST.



The American Board of Anesthesiology – Part 3: MOCA Minute®. http://www.theaba.org/MOCA/MOCA-Minute. Accessed February 4, 2019.

The American Board of Pediatrics – MOCA-Peds. https://www.abp.org/mocapeds. Accessed February 4, 2019.

Madewell JE, Hattery RR, Thomas SR, Kun LE, Becker GJ, Merritt C, Davis LW. American Board of Radiology: Maintenance of Certification. Radiology. 2005;234(1):17-25. Published online Jan 1, 2005. https://doi.org/10.1148/rg.251045979.

Brown PC, Roediger HL, McDaniel MA. Make it Stick: The Science of Successful Learning. Cambridge, MA: Harvard University Press, 2014.

Moulton CA, Dubrowski A, MacRae H, Graham B, Grober E, Reznick R. Teaching Surgical Skills: What Kind of Practice Makes Perfect? A Randomized, Controlled Trial. Ann Surg. 2006 Sep;244(3):400-9.

Mirigliani L, Lorion A. When Life Gets in the Way: Getting SPs out of Their Heads and into the Role. Presentation delivered at the 2019 Association for Standardized Patient Educators Annual Conference, Orlando, FL, June 2019.



We ask Standardized Patients (SPs) to put the outside world aside during encounters and focus only on what is happening in the room, but this is not always easy, even for the best SPs. An SP who is distracted by real-life concerns may struggle with portrayal or recall, arrive late or call out, or respond unpredictably to co-workers or feedback. Without attention, an SP may continue to struggle, in work and out. Yet the integrity of the simulation must be protected, and sometimes the SP’s employment will be in jeopardy. This session will help participants recognize potential signs of SPs who are struggling emotionally; be receptive to having conversations with those SPs; identify tools and resources that can assist the SPs; set limits; and hold to those limits, even if it means the SPs going through corrective action, up to and including termination.



An SP’s emotional state can have serious repercussions, impacting other SPs and staff as well as the SP, making portrayal of some cases more difficult and potentially risking an examination’s standardization. Being aware that shifts in emotional state are a possibility, being able to address the situation with the SP, and having tools and resources readily available for the SP will help trainers and administrators intervene, address the root cause of unusual behavior, and potentially assist a valued SP. Setting and holding to limits will help the trainer or administrator protect himself or herself and the simulation.



Participants will be able to:
1. Help SPs recognize what may be “triggers” for them, including having to simulate something they are experiencing in real life.
2. Encourage SPs to do emotional “self-checks” prior to simulations.
3. Start a potentially uncomfortable conversation with the SP.
4. Have tools and resources at hand.
5. Set limits to preserve the integrity of the simulation.


Intended Discussion Questions

1. Have you been in this situation before, on either side of the conversation? If so, what did/did not go well and what did you learn?
2. What tools have you used/could you use to help SPs dealing with emotional issues to focus on the simulation and their responsibilities to the center, their co-workers, and the students?
3. What resources are available at your institution that could assist SPs struggling with emotional difficulties?
4. Given your role, how will you prepare your colleagues and share information?



Spencer, John and Jill Dales, “Meeting the Needs of Simulated Patients and Caring for the Person Behind Them?” Medical Education 40.1 (2006): 3-5.

Bokken, Lonneke, Jan Van Dalen, and Jan-Joost Rethans, “Performance-related stress symptoms in simulated patients,” Medical Education 38.10 (2004): 1089-1094.

Varlander, Sara, “The Role of Students’ Emotions in Formal Feedback Situations,” Teaching in Higher Education 13.2 (2008): 145-156.

Lewis, Karen L., Carrie A. Bohnert, Wendy L. Gammon, Henrike Hölzer, Lorraine Lyman, Cathy Smith, Tonya M. Thompson, Amelia Wallace, and Gayle Gliva-McConvey, “The Association of Standardized Patient Educators (ASPE) Standards of Best Practice (SOBP),” Advances in Simulation 2:10 (2017).

“Building Workplace Resilience.” Guidance Resources Online. 2018. ComPsych Corporation. Retrieved from https://www.guidanceresources.com/groWeb/s/article.xhtml?nodeId=809859&conversationContext=1

Ronkowski, E. Collaborative Cognitive Item Mapping. Paper presented at the 2019 Conference of the American Board of Medical Specialties, Chicago, IL, September 2019.


Learning Objectives

Attendees will leave this presentation with ideas on how to innovate traditional item-writing workshops through Collaborative Cognitive Item Mapping (CCIM). They will also have an understanding of how to implement the Plan-Do-Check-Act (PDCA) model to innovate test development in a data-driven manner.

Session Summary

Collaborative Cognitive Item Mapping (CCIM) is a dynamic, new form of item development that builds on the literature in automatic item generation (AIG). In CCIM, a small group of subject matter experts (SMEs) develops items that assess essential testing objectives related to a clinical presentation, such as neck masses. The SMEs select high-frequency, high-impact diagnoses related to the topic, then map out patient findings and clinical decision-making processes. An item editor transforms the map into a set of items.

CCIM is beneficial because it is collaborative, systematic, and intentional. Independent item writing (IIW) can be challenging for physicians who are used to constant interactions and movement; CCIM allows SMEs to develop items without the intimidation of the blank page. The systematic approach of CCIM ensures that items include necessary details, such as duration of symptoms, and results in better distractors as SMEs think through plausible options for multiple diagnoses at the same time. With IIW, it is difficult to control for SMEs writing similar items on the same topics. With CCIM, a small group, rather than an individual, decides the testing objectives and diagnoses; this results in items that reflect the breadth and scope of the topic.

To develop CCIM, we implemented the Plan-Do-Check-Act (PDCA) model. At a pilot workshop, participants wrote items through IIW and CCIM. Through a collaboration of psychometricians, editors, and test developers, we fast-tracked a group of nearly 100 items for pretesting, and the results showed no significant statistical difference in item performance between the CCIM and IIW items. Our preliminary findings also suggest that CCIM can boost item production as much as 30% compared to traditional workshops.

Beyond the NBOME board, executive leadership, and even our 700+ member National Faculty, there are dozens of staff members and collaborators helping us protect the public in their roles behind the scenes. To commemorate our anniversary, we turned to some NBOME insiders for their insights into the work and culture of the NBOME. Last week we heard from Shirley Bodett and Dennis J. Dowling, DO. This week a few more long-time NBOME staff and collaborators shared their perspectives on our work over the years.



Sydney Steele, JD has been NBOME’s General Counsel for over 25 years. With a deep knowledge of the Americans with Disabilities Act (ADA), Sydney has been instrumental in establishing our modern Test Accommodations practices. In 2019 he won NBOME’s Santucci Award for a career of sustained contributions to the mission of the NBOME.


What was going on at the organization when you started?

When I started, as I recall, there were only about 12 or so employees in the Conshohocken office. There was no full-time President. There was no office in Chicago. There was no PE exam. And there were very few, if any, ADA claims by test-takers.

What were some of the biggest shifts in the NBOME during your time here?

The NBOME has become substantially more sophisticated in its testing practices, including Level 2-PE, and has expanded testing into related health care professions. Technology has driven a lot of that, as did my role in developing our ADA accommodations for students with disabilities.

What is your fondest memory of your time with the NBOME?

Working with the talented and dedicated people at the NBOME, and watching the organization grow from about 12 or so employees without a full-time president, to what it is today.



NBOME’s principal Research Associate, Yi Wang, MS has been with the organization for 18 years. She was awarded the President’s Award for Outstanding Service.



How has technology changed how the organization works?

When I started working with NBOME, all of our assessments used paper and pencil. In 2005, we moved COMLEX-USA to a computer-based format and began developing COMLEX-USA Level 2-PE. Following that, we built an entire portfolio of computer-based, and even web-based, assessments.

What’s your favorite thing about working at the NBOME?

Everybody probably says the same thing, but it’s really true that the people that make up the NBOME are really the best thing about it. I’ve been here nearly 20 years, and I’ve seen us grow from 20 employees in 2001 to nearly 130 today, but I still know I can count on every member of my team.



The first face you see when you enter our Philadelphia corporate offices is Rachel Maxwell. She keeps us running like a well-oiled machine as our Coordinator for Operations. She has been with the organization for 15 years.



What’s changed since you’ve worked here?

When I first started in 2004 we were just opening the testing center for the COMLEX-USA Level 2-PE exam. Nothing was done electronically the first few years. Each student filled out a paper application and mailed a check in to pay for their exam. We manually registered students for each testing date on paper and then uploaded the registrations into the system to run the exam. We sent all score reports via snail mail as well. We’ve come a long way since then. Computers have simplified a lot of processes, but they still keep us busy.

What is your fondest memory of your time with the NBOME?

Do I have to pick one? I’ve been here so long that I have so many fond memories of NBOME. There were times when we were smaller when we’d hold company events at the company president’s house or go on a company outing for an afternoon of snacks and swimming; the entire staff even attended dinner with the board.

What’s your favorite thing about working at the NBOME?

I have met some very interesting people while working with our National Faculty, but more importantly I have made some wonderful friends among my coworkers as well. The NBOME is growing, but I still feel like we maintain a small-company feel where everyone knows each other and cares about each other and, like any family, we go through our ups and downs, good and bad.



We are pleased to congratulate Karen J. Nichols, DO, former president of the AOA and vice chair of the Accreditation Council for Graduate Medical Education (ACGME) board, for being named chair-elect of the ACGME.

“I have had the honor of serving on the ACGME board for five years and have clearly seen the laser-focus of the entire organization on our mission – ‘…to improve health care and population health by assessing and advancing the quality of resident physicians’ education through accreditation,’” said Dr. Nichols.

Dr. Nichols has a long, decorated history in osteopathic medicine. She served as the first woman president of the AOA, president of the Arizona Osteopathic Medical Association and president of the American College of Osteopathic Internists.

From 2002 to 2018, Dr. Nichols was dean of Midwestern University Chicago College of Osteopathic Medicine. Prior to that, she was assistant dean, post-doctoral education and division director, internal medicine, at the Midwestern University Arizona College of Osteopathic Medicine. A frequent national speaker on leadership, end-of-life care and osteopathic medicine, Dr. Nichols has also received seven honorary degrees and top awards from the AOA and the American Association of Colleges of Osteopathic Medicine.

She currently holds several positions at the ACGME.  In addition to her newly appointed post, she is a member of the executive committee, chair of the governance committee, and a member of the standing committees for education, policy and monitoring.

“The ACGME has worked to transition to an accreditation model that encourages excellence and innovation. My vision is to work with our fine ACGME board, staff and volunteers to see that the ACGME continues to move forward while being thoughtful and current.”

All of us at the NBOME recognize Dr. Nichols’ accomplishments, and we sincerely wish her the best of luck in her new role with the ACGME.

Read more about Dr. Nichols’ role as chair-elect of the ACGME.

In honor of the NBOME’s 85th anniversary, we sat down with some inspirational members of the osteopathic medical community to discuss their thoughts and perceptions of the NBOME over the years and now.


As NBOME celebrates 85 years of osteopathic medical assessment, how do you feel the organization has impacted the osteopathic medical profession over recent decades?

John Potts, MD: The NBOME’s examinations, developed by their many highly capable and dedicated volunteers, have kept pace with the rapid advance of medical knowledge. As such, the NBOME has pushed both osteopathic medical students and the colleges of osteopathic medicine to ever-higher achievement.

Thomas Cavalieri, DO: The NBOME has impacted the osteopathic medical profession through its commitment to excellence and its steadfast adherence to protecting the public. Fulfilling this mission derives from NBOME’s ability to create an exam that truly integrates osteopathic principles and practice while providing evidence for the need for a distinct profession to have a distinct licensure exam.

Bill Burke, DO: The NBOME, through the actions of its Board and staff, has made an invaluable contribution to the growth and development of the osteopathic medical profession. The ability of DOs to obtain licensure in all 50 states is in large part due to the development and continuous modernization of the COMLEX-USA series. It is exciting to see the innovation coming from this organization, which will assist practicing physicians in maintaining their board certification through platforms like CATALYST.

William Mayo, DO: Throughout the entirety of its history, the NBOME has defended the distinction of DOs and our approach to our patients—sometimes even against strong opposition. Psychometrically valid, defensible exams, such as COMLEX-USA, provide a strong case to be made on behalf of the profession, and have been endorsed by a number of organizations.


What advice would you give the NBOME as it completes its first 100 years between now and 2034?

Karen Nichols, DO, MA: I would encourage the NBOME to continue holding the bar high in order to ensure that qualified osteopathic physicians are prepared to serve the public.

John Potts, MD: These times are challenging in many ways and I can only predict more challenging times ahead for medical education, both osteopathic and allopathic. I expect the NBOME will continue to fulfill its mission as it has in the past, and continue to uphold the standards that further enable protecting the public.

Humayun Chaudhry, DO: NBOME faces the same challenges confronting all testing entities: the need to demonstrate the continued value of independent assessment as a critical adjunct to medical education and training. This is particularly important at a time when the broader environment seems less amenable to regulation overall.

William Mayo, DO: I would recommend that the NBOME continue to collaborate with the AOA, AACOM and the FSMB to promote distinctiveness across the continuum.

Thomas Cavalieri, DO: It is my hope that the NBOME remains steadfast in its commitment to protecting the public and assuring continued high-quality examinations that truly reflect the essence of osteopathic medicine.



John R. Potts III, MD, is the Senior Vice President, Surgical Accreditation at the Accreditation Council for Graduate Medical Education (ACGME). Dr. Potts also serves as an adjunct professor of Surgery at the University of Texas Houston Medical School (UTHMS). He has also served on the ACGME’s Committee on Innovation in the Learning Environment and on the Standing Panel for Accreditation Appeals in the specialty of surgery.


Thomas A. Cavalieri, DO, is the dean at Rowan University School of Osteopathic Medicine and also serves as a professor of medicine and Osteopathic Heritage Endowed Chair for Primary Care Research. Dr. Cavalieri is a past chair on the NBOME’s Board of Directors, and a longtime National Faculty leader. He was first recruited to the National Faculty in the late 1980s as an exam writer, and oversaw the launch of the COMLEX-USA Level 2-PE in 2004.


Bill Burke, DO, is the Dean of the Ohio University Heritage College of Osteopathic Medicine-Dublin Campus and Chair of Osteopathic International Alliance. He served as a trustee of the American Osteopathic Association (AOA) and as the chair of its departments of Educational Affairs, Governmental Affairs, and Research and Development, as well as its Bureau of Communications and Committee on AOA Governance and Organizational Structure. He is a founding director of the International Primary Care Educational Alliance.


William S. Mayo, DO, was president of the American Osteopathic Association (AOA) for 2018–2019. Throughout his tenure, Dr. Mayo has served the AOA in many capacities. Additionally, Dr. Mayo is a past president of the Mississippi Osteopathic Medical Association and the Mississippi EENT Society. He has served on the Mississippi State Board of Medical Licensure since 2006 and was president from 2010 to 2012.


Karen J. Nichols, DO, MA, MACOI, CS, is the chair elect of the Accreditation Council for Graduate Medical Education board of directors, and has served as president of the American Osteopathic Association, president of the Arizona Osteopathic Medical Association (AOMA), and president of the American College of Osteopathic Internists, being the first woman to hold all of those


Humayun Chaudhry, DO, is the President and Chief Executive Officer of the Federation of State Medical Boards (FSMB) of the United States and was chair of the International Association of Medical Regulatory Authorities (IAMRA) from 2016 to 2018.




Beyond the NBOME board, executive leadership, and National Faculty, there are dozens of staff members and collaborators helping us protect the public in their roles behind the scenes. To commemorate our anniversary, we turned to some NBOME insiders for their insights into the work and the culture that’s brought us to where we are today.



Senior Operations Specialist Shirley Bodett has been with us longer than any other staff member. In her 34 years with us, she’s witnessed many of the changes that have shaped the modern-day NBOME.



NBOME: What was happening with the organization when you started?

When I was hired in 1984, there were only two other employees – an Executive Director, Carl W. Cohoon and his assistant Carol Thoma. I was hired to answer phones and do clerical work.

To create exams (one for each discipline), the discipline chair would look through coded cards and select test items based on categories. The staff would then use a word processor and floppy discs to put these questions into a two-column document. This was then sent to a printer, who published the exam books.

Exam scoring was contracted out to the University of Iowa, where score reports were printed and sent to us in triplicate for distribution. We entered candidate names into huge black books by hand, in alphabetical order, by school and graduating class. Later, we entered each candidate’s scores into that same book. When transcripts were ordered, we again opened these books to find the information needed to complete the transcript.

What were some of the biggest changes you’ve seen in the organization?

Computerization has completely changed how we do nearly everything. We’ve brought a lot of our processes in-house, and our vastly expanded staff is much more involved in item creation, editing, and review.

What is your fondest memory of your time with the NBOME?

Working with some of the same people for many years, and getting to know physicians, Board members, other Subject Matter Experts, and staff as individuals rather than as defined by their profession.


A lifelong advocate for Osteopathic Manipulative Medicine (OMM), Dennis J. Dowling, DO, FAAO, our Coordinator for OMM Assessment, began working with the NBOME 26 years ago. His work has been instrumental in launching our COMLEX-USA Level 2-PE.



When did you begin working with the NBOME?

I started in the early ’90s after becoming a faculty member at NYCOM. One of my professors, Robert E. Mancini, PhD, DO, a pharmacologist who became an osteopathic physician, is also a former NBOME president. Dr. Mancini got me involved with a task force he had put together to integrate Osteopathic Manipulative Medicine (OMM) with other questions.


What were some of the biggest changes in your time here?

In 1997 I expressed an interest in the examination of osteopathic manipulative skills and utilizing scoring rubrics to better reflect the process. We came to a major crossroads in the early 2000s that could have easily led to DO students taking a generic test for all medical students, with a tacked-on OMT station or two, and no other osteopathic distinctions. But thanks to our work at the time, we now have a fully integrated osteopathic examination that is a much more effective way of testing osteopathic students preparing to enter postgraduate training.

How has technology changed in terms of how we operate?

Technology expands the ability to create much more material and develop alternate processes of testing. It also opens us up to greater security risks than ever before. We have to keep up with advancing technology and capabilities, while meeting the needs of the population that we are examining.

What’s your favorite thing about working at the NBOME?

There’s a camaraderie and a sense of purpose that permeates everything we do. We are truly trying to develop the best product for protecting the public and enhancing osteopathic medicine. Without the strength of the NBOME, osteopathic medicine would be a very different and much less effectual profession than exists today.



Next week we’ll catch up with former NBOME General Counsel, Sydney Steele, 2019 NBOME President’s Award winner Yi Wang, and Coordinator of Operations, Rachel Maxwell for their perspectives on 85 years of NBOME.



Sheryl Bushman, DO, served as our Chair from 2005 to 2007, overseeing a great investment in development for the board, the organization, and its products to guarantee their validity at a time of increased scrutiny. From 2011 to 2013, Janice Knebl, DO, came on as chair and oversaw the creation of the Blue Ribbon Panel to modernize COMLEX-USA to a competency-based model (which we’ve just finished implementing this year). Both of these women have had a distinct impact in shaping our organization; they also happen to be the first and second chairwomen of the NBOME.

We sat down with these two important figures to hear their perspective on the NBOME’s 85-year history, and their own part in it.


When AT Still opened the first COM in the 19th century, it was pretty radical that women were able to study there. Famously the first person to take the NBOME’s first exam was a woman (Margaret Barnes). How do you feel about the state of women in the NBOME, and in osteopathic medicine on the whole? Are we living up to the legacy?

Dr. Sheryl Bushman: The NBOME has always treated women with the utmost respect.  It is part of our DNA.  I recall before becoming Chair they asked “What should we call you?  Chair-man Bushman doesn’t sound appropriate.” We’ve simply called the position “Chair” ever since. Even to this day, I see committee Chairs purposefully review the demographics of their members and try to generate membership to reflect the profession considering race, sex, age, location, etc.  This encourages the NBOME’s culture of collaboration, intellectual stimulation, respect and sensitivity. AT Still would be proud to see how far we have come.

Dr. Janice Knebl: I am so very proud that while I was NBOME Chair, the Board of Directors was composed of 40% women. As I participated in the Coalition for Physician Accountability, which included all of the other major physician groups, we had a larger percentage of women physicians and board members than any of the other organizations. It is critical for the NBOME Board to reflect the “face” of osteopathic medicine, which is on average about 50% women in every College of Osteopathic Medicine class.


What do you think women bring to the table, particularly when it comes to leadership roles?

SB: Whether we are men or women, we all come to our leadership roles with a different style. I am certain that my role as Chair helped me develop my leadership skills in being able to provide difficult news clearly, directly, but gently.

JK: I believe that women bring empathy, strong work ethic and collaboration to osteopathic medicine. Of course, these are generalities that don’t apply to all women. When working with women leaders in osteopathic medicine I have seen them be solution focused and inclusive of diversity of opinions. Most of the women leaders I have worked with have given over 100% to their positions.


What does the NBOME do well when it comes to promoting gender diversity in leadership, and what do you think we could do better?

SB: As I’ve said, the NBOME has been committed to reflecting the demographics of the osteopathic profession, even as I first became involved in 1989.  They do a good job. If there is a gap, I imagine it’s due more to a lack of awareness among the candidate pool than a lack of inclusivity on the NBOME’s part. Perhaps identifying a way to advertise or communicate opportunities could improve participation.

JK: NBOME intentionally recruited women for the Board of Directors during my tenure as Chair. In order to have the gender diversity, there needs to be an intentional approach by inviting and encouraging participation in all aspects of the organization by women. There needs to be an understanding and respect that women may have other roles and responsibilities during their careers that will change to enable them to participate at different times in the organization. NBOME could consider supporting a leadership track for women and men who are identified for future leadership roles within the organization.


How do you look back on your experience with the NBOME?

SB: Among all the leadership positions I’ve held in my career, I treasure this position the most for several reasons. The NBOME is made up primarily of volunteers with great affection for the osteopathic profession and the desire to give back.  Unlike many professional organizations, egos are left at the door.  Patient wellbeing and student fairness are always at the forefront in our decisions, from test development to the cost of exams, etc.  Working with colleagues across the entire spectrum of medical care for this organization is a true blessing.

JK: It was a true privilege for me to serve as an officer and Chair for the NBOME. Being involved with the NBOME and having the opportunity to be a leader in assessment for osteopathic medicine has been a true highlight of my career as an academic osteopathic physician. The mission of the NBOME to protect the public is noble and necessary for the public good and for all of us as patients.



Sheryl Bushman, DO currently works as Chief Medical Informatics Officer at Optimum Healthcare IT, and contributes to our COMLEX-USA Level 2-PE Advisory Committee. She served as the NBOME’s Board Chair from 2005 to 2007.


Janice Knebl, DO currently practices and teaches Geriatric Medicine in Fort Worth, TX, in addition to chairing our COMLEX-USA Composite Examination Committee. She served as NBOME Board Chair from 2011 to 2013.





COMLEX-USA  |  New Level 1 and Level 2-CE Exams Have Launched

We are pleased to announce the completed launch of all elements of the enhanced COMLEX-USA exam series under the new COMLEX-USA Enhanced Master Blueprint. Level 1 successfully released this spring, followed by Level 2-CE in late summer. These exams have joined Level 3 and Level 2-PE, which launched in 2018 and earlier this year, respectively. New passing standards for COMLEX-USA Level 1 and Level 2 have also been implemented for the 2019-2020 test cycles.

This multi-stage release is the culmination of nearly 10 years of work in evidence-based design by experts and leaders from across the organization and the country who contributed in all areas to the creation and deployment of this state-of-the-art assessment.

The exams launched to heavy candidate volume with over 1,500 candidates completing each exam during the first weeks. To date, over 5,000 Level 1 examinations have been administered, with similar numbers for Level 2-CE.

These examinations also mark a move to Prometric’s new test driver, SURPASS, where NBOME already administers its Core Osteopathic Recognition Readiness Examination (CORRE), as well as the latest version of COMLEX-USA Level 3. Since the move to SURPASS, some students have encountered performance problems during their administrations, including latency and examination restarts. Prometric continues to investigate the cause and made system upgrades in early June to address the issues. NBOME is currently offering online tutorials for Levels 1, 2-CE and 3 for candidates who would like to learn more about the new test interface being offered at Prometric Testing Centers.

Please visit the COMLEX-USA pages of our website to learn more.



COMAT  |  Foundational Biomedical Sciences (FBS) Exams Available this Academic Year

Since the inception of the COMAT Clinical Exams in 2011, osteopathic medical students have taken over 250,000 COMAT Clinical exams. As a result, we have seen dramatic improvement in COMLEX-USA Level 2-CE scores by osteopathic medical students. This is particularly important in the era of Single Accreditation for GME.

Celebrating 85 years of protecting the public through valid and reliable licensing exams, NBOME has spent the last 5 years developing an expanded COMAT portfolio to assist in DO student success.

COMAT Clinical exams initially focused on assessment of clinical education and knowledge typically found in the year 3 and 4 COM curriculum. The success of this initial series of exams led the NBOME, in collaboration with its National Faculty, to expand its offerings and develop assessments for the Foundational Biomedical Science (FBS) curriculum, which takes place during years 1 and 2. After careful development and testing, the COMAT FBS Comprehensive (FBS-C) exam became available in December 2018. Since its inception, the 5-hour, 250-question COMAT FBS-C has been used successfully by many COMs across the country. This assessment has enabled both COM students and faculty to better understand the effectiveness of the school’s classroom curriculum and identify areas for student development.

Scheduled for release in January 2020, the suite of 14 FBS Targeted (FBS-T) exams further supports osteopathic medical student professional success. These exams are divided between 6 core science disciplines, including Anatomy and Pharmacology, and 8 body systems, including musculoskeletal and cardiovascular. Each 90-minute, 62-question COMAT FBS-T exam is designed to evaluate a student’s knowledge in a focused subject area. Timely score reports detail areas of strength and challenge, and will give COM faculty and students insight to guide COMLEX-USA Level 1 preparation.

Should you have any questions about COMAT or the new FBS examinations, please visit the COMAT pages of our website.

Contributed by:  Michael Finley, DO  |  Senior VP for Assessment  |  NBOME 



CATALYST  |  Continuous Learning Platform

Wouldn’t it be nice if you could make learning new material easier?  And what if you could avoid taking another traditional multiple-choice exam to demonstrate what you’ve learned?

Inspired by research from leading cognitive psychologists, as well as by the success of the American Board of Anesthesiology’s MOCA-Minute, the NBOME began its research journey into developing its own continuous learning platform – CATALYST.

CATALYST is a formative assessment platform designed as an alternative to traditional physician competence and practice-relevant assessment. Based on the outcomes of several successful pilots conducted in 2017 and 2018, the NBOME has expanded its partnership with ITS to develop a more sophisticated platform that can be customized to meet various client assessment needs.

With the primary goal of eliciting user feedback on the newly designed platform, the CATALYST 2.0 Platinum Pilot was released on June 5.  Participants included osteopathic medical students, residents, NBOME National Faculty and NBOME staff. Learners were asked to answer 70 multiple-choice questions during a five-week period and were offered a choice of receiving 2 items a day, 14 items a week, or all 70 items at once. Following the completion of each item, participants were asked to gauge their confidence in answering the question and the question’s relevance to their specialty / field of study. Whether or not the question was answered correctly, the participant was provided with immediate feedback including the correct answer, a rationale for the answer, as well as references and links to additional learning resources.

Feedback has been very positive — 94% said the platform met their expectations and 91% found the system easy to navigate. What did participants like most about CATALYST?  One responded that it was “very easy to navigate, good questions.” Another “liked that the platform showed the learning objectives of each question, helped identify why the question was being asked, and identified what the learning goals were for each question.” And a National Faculty member liked that “it could be done on my own time across multiple platforms and devices.”

The cross-functional CATALYST team, led by Sandra Waters, MEM, VP for Collaborative Assessment & Initiatives, is preparing for the next CATALYST release in September which will include an enhanced dashboard with normative statistics, highlighting of item components, and streamlined registration. Preparation has also begun for the delivery of COMSAE Phase 2 on CATALYST, providing COMLEX-USA Level 2-CE candidates the opportunity for alternative learning through formative assessment.

Contributed by:  Dot Horber, PhD  |  Director for Continuous Professional Development  |  NBOME


Browne, M, Wojnakowski M, Horber DT.  Choosing Wisely: So Many Options for Assessment Administration. Which will Enhance Your Exam’s Validity and Fairness? Paper presented at the 2019 Innovations in Testing Conference, Orlando FL, March 2019.


Short Description

With advances in assessment, credentialing organizations are presented with myriad options to “enhance” test format and administration. Two organizations have been conducting research and pilot testing to explore some options alone and in combination: use of resources while testing, and high-stakes testing in remote proctored conditions.

Reference availability may increase an assessment’s fidelity to real life clinical situations, but it raises many implementation questions: Which references will be useful and what is the best way to make them available? What is the effect on test time needs, outcomes, and validity?

Remote proctoring is attractive to candidates as a convenience and can offer some cost savings. In reality though, just how easy is it to test from home? What are the security implications? Copyright treats remote proctored tests differently; how can this be addressed?

The presenters will discuss obstacles encountered, comparison of outcomes, and best practices found.


Full Description

With advances in assessment, credentialing organizations are presented with myriad options to “enhance” test format and administration. Two organizations have been conducting research and beta testing to explore some options alone and in combination: use of resources while testing, and high-stakes testing in remote proctored conditions.


As certification organizations move toward nontraditional assessments, provision of reference resources during assessment is one of many areas of uncertainty. Although reference material availability likely increases an assessment’s fidelity to real life applicable clinical situations, it raises many implementation questions as well as concerns about test outcomes and validity.


Remote proctoring, long a hot topic, has rarely been contrasted with in-person proctoring in a high stakes examination. The differences that materialized in candidate acceptance, test administration and outcomes can inform much constructive discussion.


One organization is researching options for continued professional certification for its 50,000-plus certificants. A 1,500-participant research study incorporating open-book features and different proctoring conditions was completed in October 2018. The research divided the participants into six different experimental conditions – in-person proctored vs remote proctored, no resources, e-resources, and hard copy resources.  Presenters will discuss development of the research design as well as the research outcomes.


Presenters from the second organization will discuss aspects of the development of an innovative item format that focuses on competency domains other than clinician knowledge recall. This incorporates the use of online resources to locate clinical diagnostic and treatment information to answer questions. The item format contains a clinical case scenario with associated multiple-choice items that would require most examinees to access online resources in order to answer the questions, as they would in current day-to-day practice. Presenters will discuss the item development process and relate quantitative and qualitative data obtained from the Proof-of-Concept study. Lessons learned from this study and planned next steps will provide insights to organizations seeking more authentic modes of assessment of clinical behavior and decision-making.


NCME Paper 2019

The Effects of Test Familiarity on Person-Fit and Aberrant Behavior

Hotaka Maeda, Ph.D. & Xiaolin Wang, Ph.D.


Abstract (50 words)

The person-fit to the Rasch model was evaluated for examinees taking multiple subject tests with a similar structure. The evaluation considered which test in the sequence (i.e., first, second) was taken. Compared to an examinee’s first test, person-fit improved for later tests. Test score reliability may improve with test familiarity.



Aberrant behaviors are unusual test-taking behaviors that introduce noise to test data. They introduce nuisance constructs that are not intended to be measured and thus threaten measurement validity. One source of aberrant behavior is unfamiliarity with tests (Meijer & Sijtsma, 2001; Rupp, 2013). Examinees who take a new and unfamiliar test are likely to struggle to understand the test structure, gauge how much time they have for each item, navigate through a computer-based test, and handle their nerves. In contrast, examinees who are familiar with the test structure are likely to be less stressed, know how to prepare, and be able to complete the test efficiently. Compared to first-time takers’ results, scores for examinees who are familiar with the test structure may be less affected by the nuisance construct of test unfamiliarity and be more representative of their underlying ability. To the authors’ knowledge, this speculation has not been investigated and reported in the literature. Therefore, the purpose of this study is to examine the effects of test familiarity on person-fit and aberrant behavior using observed data.



The instrument used in this study is a comprehensive medical achievement examination composed of eight clinical subject tests. Medical students typically take the test at the end of their clinical rotation in a given clinical subject. All clinical subject tests are structured identically:

  • They are administered through the same platform.
  • Item stems are worded similarly as they all target commonly encountered patient scenarios.
  • All items in all tests are multiple-choice items with only one best answer.


Many examinees take all eight clinical subjects, but they do not take them in the same order. They can also choose to retake any clinical subject test. Therefore, the context of the instrument used in this study can be considered a quasi-experimental setting for assessing the effects of test familiarity on person-fit and aberrant behavior, where test familiarity can be defined by the number of clinical subject tests (including retakes) a candidate has taken.

Response data in all clinical subjects from July 2017 to June 2018 were used. Exploratory factor analysis with no rotation was conducted for each subject separately in order to identify high-quality items. Items were removed from the data if their factor loadings on the first dimension were less than 0.1. Then, the data were modeled using the Rasch model. For each subject, test forms were equated through concurrent calibration. Ability was estimated with maximum likelihood, then standardized as N(0,1) and bounded to [-5, 5] so that the values could be compared across subjects.
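As an illustration of the ability-estimation step, the following is a minimal sketch of a Rasch maximum-likelihood estimate using Newton-Raphson, with the estimate clipped to [-5, 5] as described above. The function names, the starting value, and the convergence settings are assumptions made for this example; the sketch does not reproduce the authors' calibration software, the concurrent calibration, or the standardization to N(0,1).

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_ability(responses, difficulties, bound=5.0, tol=1e-6, max_iter=100):
    """Maximum-likelihood ability estimate via Newton-Raphson, clipped to [-bound, bound].

    responses: list of 0/1 item scores; difficulties: matching item difficulties.
    (Hypothetical helper for illustration only.)
    """
    raw = sum(responses)
    # All-incorrect or all-correct patterns have no finite MLE; return the bound.
    if raw == 0:
        return -bound
    if raw == len(responses):
        return bound
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_prob(theta, b) for b in difficulties]
        gradient = raw - sum(probs)                      # first derivative of log-likelihood
        information = sum(p * (1.0 - p) for p in probs)  # negative second derivative
        step = gradient / information
        theta = max(-bound, min(bound, theta + step))
        if abs(step) < tol:
            break
    return theta
```

For a two-item test with difficulties -1 and 1 and one correct response, the estimate is 0, because the expected raw score at theta = 0 equals the observed raw score of 1.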

Aberrant behavior was assessed using the lz* person-fit statistic (Snijders, 2001). The lz* is asymptotically distributed as N(0,1), where positive values represent good person-fit, and negative values represent poor fit. If examinees respond to the items in a reasonable manner (e.g., without aberrance, because they are familiar with the tests), lz* should be high, indicating that their responses fit the model well. The lz* is uncorrelated with ability when aberrant behavior is not present. One of the typical cutoffs for determining poor person-fit is -1.645, which is equivalent to the one-tailed .05 alpha level.
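To make the statistic concrete, the sketch below computes the simpler, uncorrected standardized log-likelihood statistic lz from item responses and model-implied probabilities. This is an assumption-laden simplification: the lz* used in the study adds a correction for the fact that ability is estimated rather than known (Snijders, 2001), and that correction is not implemented here. The function name and the example probabilities are hypothetical.

```python
import math
from statistics import NormalDist

def lz_statistic(responses, probs):
    """Uncorrected standardized log-likelihood person-fit statistic (lz, not lz*).

    responses: 0/1 item scores; probs: model-implied P(correct) for each item.
    Requires at least one probability different from 0.5, otherwise the
    variance term is zero.
    """
    log_lik = sum(u * math.log(p) + (1 - u) * math.log(1 - p)
                  for u, p in zip(responses, probs))
    expected = sum(p * math.log(p) + (1 - p) * math.log(1 - p) for p in probs)
    variance = sum(p * (1 - p) * math.log(p / (1 - p)) ** 2 for p in probs)
    return (log_lik - expected) / math.sqrt(variance)

# Illustrative probabilities for five items, ordered from easiest to hardest.
probs = [0.9, 0.7, 0.5, 0.3, 0.1]
consistent = lz_statistic([1, 1, 1, 0, 0], probs)  # misses only the hard items
aberrant = lz_statistic([0, 0, 0, 1, 1], probs)    # misses the easy items instead

# One-tailed .05 cutoff on the standard normal scale, approximately -1.645.
cutoff = NormalDist().inv_cdf(0.05)
```

In this toy example the consistent pattern yields a positive value, while the reversed pattern falls well below the cutoff and would be flagged as poor person-fit.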

The degree of person-fit (i.e., lz*) was regressed on the sequence of tests using two separate two-level random intercept models. As examinees took multiple tests, the tests were modeled as nested within examinees. Model 1 included three exam-level predictors: 1) examinee age in years at the time of the exam, 2) standardized test score, and 3) whether the subject being taken is a retake. The only predictor at the examinee-level was the number of times the person had ever retaken any clinical subject test (0, 1, 2, and >2). The model could be written as:


Model 1: lz* ~ age + test.score + subject.retake + total.retake


Model 2 included all the predictors in Model 1 in addition to the test sequence as a categorical variable from 1 to 11 (i.e., the order in which the examinees took the test, such as first test, second test, etc.).


Model 2: lz* ~ age + test.score + subject.retake + total.retake + test.sequence


The test sequence for some students did not start with “first” if they had taken the tests prior to July 2017. The test sequence can extend longer for students who retake some clinical subject tests.

Residual plots were used to confirm that the residuals were approximately normally distributed with the same mean and standard deviation at every fitted value. Because Model 1 was nested within Model 2, they were compared using a likelihood-ratio test.



For the purpose of this specific study, 1,422 out of 5,594 items were removed from analysis, many of which were pretest items. All subjects achieved unidimensionality after the removal of such items. In addition, response data from 55 tests were removed because of an abnormally high test sequence due to retakes (12 or more). The final sample size across all test subjects was 4,172 items on 42,903 test administrations given to 10,135 examinees (see Table 1). Each test contained an average of 96.7 items (SD = 9.3). A majority of examinees had no history of retaking any clinical subject test (68.4%). Only 6.7% of the tests were retakes.


Table 1. Number of Exams by Sequence and Clinical Subject

Test                             Clinical Subject
Sequence      A      B      C      D      E      F      G      H    Total
1            71    672    568    394  2,656    585    468    507    5,921
2           138  1,261    881    809    476    730    646    678    5,619
3           172    705    744    824    577    769    767    884    5,442
4           192    700    760    825    473    744    753    738    5,185
5           198    697    660    721    642    803    681    774    5,176
6           231    598    737    705    924    698    676    667    5,236
7           527    590    683    617    574    615    629    689    4,924
8         1,181    352    334    288    575    262    287    353    3,632
9           207    124    175     90    109     99     90    125    1,019
10          228     51     64     26     75     37     41     35      557
11           75      4      7      8     80      8      6      4      192
Total     3,220  5,754  5,613  5,307  7,161  5,350  5,044  5,454   42,903

Note. Many examinees take all eight clinical subjects, but they do not take them in the same order. Although there are only eight clinical subjects, the test sequence can extend beyond eight because of retakes.


Mean lz* was 0.04 (SD = 1.09), while the mean standardized test score was 0.02 (SD = 1.14). The mean SE of the standardized test scores was 0.51 (SD = 0.07). The mean standardized test score for those who had a history of retaking any clinical subject test was lower (M = -0.36, SD = 1.13) than for those who did not (M = 0.25, SD = 1.08). The percent of the test records exhibiting poor person-fit (i.e., lz* < -1.645) was 6.7%. Standardized test scores were positively correlated with lz* (r = .23).

A likelihood-ratio test showed that the addition of the test sequence predictor significantly improved the model fit, χ2(10)=75.05, p<.001. Controlling for examinee age, total historical test retake count, whether the subject being taken is a retake, and standardized test score, the student person-fit was the poorest for the first test compared to all later tests (p < .05). The coefficients from Model 2 are shown in Table 2. Compared to the first test, person-fit improved for the second exam by 0.07, and on the 11th test by 0.27.
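The reported p-value can be checked with a short sketch. The 10 degrees of freedom correspond to the 10 test-sequence coefficients that Model 2 adds to Model 1 (11 sequence categories minus the reference category). The helper below is a hypothetical convenience that uses the closed-form upper-tail probability of the chi-square distribution, which holds only for even degrees of freedom; it is not the software used in the study.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses the identity sf(x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

# Likelihood-ratio statistic and degrees of freedom as reported in the text.
p_value = chi2_sf_even_df(75.05, 10)
```

The resulting probability is far below .001, consistent with the reported result.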


Table 2. Model 2 Coefficients

Predictor                         Coef    SE      df       t      p
(Intercept)                       0.41  0.05  32,754    8.65  <.001
Examinee-level predictors
  Retake total = 0 (Reference)
  Retake total = 1               -0.04  0.02  10,131   -2.62   .009
  Retake total = 2               -0.11  0.02  10,131   -4.61  <.001
  Retake total > 2               -0.19  0.03  10,131   -6.73  <.001
Test-level predictors
  Standardized score              0.19  0.01  32,754   38.17  <.001
  Examinee age in years          -0.02  0.00  32,754   -9.30  <.001
  Retaking the clinical subject   0.10  0.02  32,754    4.44  <.001
  Test sequence = 1 (Reference)
  Test sequence = 2               0.07  0.02  32,754    3.63  <.001
  Test sequence = 3               0.08  0.02  32,754    4.27  <.001
  Test sequence = 4               0.10  0.02  32,754    5.03  <.001
  Test sequence = 5               0.13  0.02  32,754    6.62  <.001
  Test sequence = 6               0.13  0.02  32,754    6.57  <.001
  Test sequence = 7               0.10  0.02  32,754    4.66  <.001
  Test sequence = 8               0.13  0.02  32,754    5.86  <.001
  Test sequence = 9               0.10  0.04  32,754    2.79   .005
  Test sequence = 10              0.20  0.05  32,754    4.12  <.001
  Test sequence = 11              0.27  0.08  32,754    3.27   .001

Note. Person-fit was modeled using a two-level random-intercept model.


Model 2 also showed that those who had a history of retaking any clinical subject test tended to have lower person-fit than those who did not (p <.05). However, retaking the same clinical subject test was associated with an increase in person-fit by 0.10 (p <.001).



This study shows that person-fit to the Rasch model improves as examinees gain experience in taking a series of tests with a similar structure. Improvements in person-fit were observed beyond the first and second tests. Test familiarity increased lz* by 0.1 or more. For reference, an increase in lz* from 0 to 0.1 is equivalent to an increase in person-fit by 3.98 percentiles. The findings indicate that the reliability of the test scores may improve with test-taking experience, and they show the importance of examinee familiarity with the test structure. The improvement in person-fit by increased test familiarity supports the provision of practice materials in order to minimize the negative impacts from test unfamiliarity and to promote measurement validity.
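The percentile figure quoted above follows from the standard normal distribution that lz* approximately follows. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
from statistics import NormalDist

def percentile_gain(lz_before, lz_after):
    """Change in standard-normal percentile rank between two lz* values."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return 100 * (std_normal.cdf(lz_after) - std_normal.cdf(lz_before))

gain = percentile_gain(0.0, 0.1)
print(round(gain, 2))  # 3.98
```

Because the normal density is highest near zero, the same 0.1 increase in lz* corresponds to a smaller percentile gain further out in the tails.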

When interpreting the data, retakes of the same clinical subject exams needed to be considered. The option to retake any test allowed the test sequence to go beyond the number of available clinical subjects (i.e., eight). Clearly, a person who has taken the same test multiple times (despite taking a different form every time) should be more familiar with the test than first-time takers. The examinees who have retaken any of the clinical subject exams tend to be lower achievers and have lower person-fit compared with non-retakers. However, their person-fit improved upon retaking the same clinical subject test. Also, results suggest poor person-fit occurred due to spuriously low-scoring aberrant behavior (i.e., poor performance), such as running out of time, more often than spuriously high-scoring behavior such as item pre-knowledge. This led many of the poor performers to retake the test. However, regardless of the test-retaking behavior, familiarity with the test structure led to increases in person-fit.

The study is limited in that we did not directly investigate whether improvement in person-fit is in fact associated with an increase in the accuracy of the standardized test scores. This is rather difficult to show empirically, but it should be pursued in the future. Further, a quasi-experimental design was used, where some factors were uncontrolled, including allowing examinees to retake any test at their own will. These test-retaking patterns were not random as they were correlated with important variables such as the standardized test scores. The study should also be replicated using other psychometric models and test data.



Meijer, R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107-135.

Rupp, A. A. (2013). A systematic review of the methodology for person fit research in Item Response Theory: Lessons about generalizability of inferences from the design of simulation studies. Psychological Test and Assessment Modeling, 55, 3-38.

Snijders, T. (2001). Asymptotic null distribution of person-fit statistics with estimated person parameter. Psycho

New York Colleges of Osteopathic Medicine Educational Consortium (NYCOMEC)

This presentation to osteopathic medicine residency directors focused on preparing for their Clinical Learning Environment Review (CLER), an ACGME program instituted as part of the next accreditation system. The goal of CLER is to ensure residency programs train residents to ensure patient safety. The presentation focused on what is required to ensure patient safety, i.e. “learner safety.” It presented how to debrief residents using “good judgment” (a focus on performance gaps) and “empathic inquiry” (debriefing that develops self-reflection and self-correction). The talk provided examples of effective and ineffective feedback and debriefing approaches.

Parshall C, Julian E, Parikh S, Horber DT.  Using Nudges for More Effective Exam Programs. Paper presented at the 2019 Innovations in Testing Conference, Orlando FL, March 2019.

Short description:

Nudges are small, deliberate tactics we can use to help our test-takers (and our SMEs) do the things they want to do. While our testing programs have many points that can derail candidates, through small and subtle changes we can help them persist through the life cycle of application, testing (and retesting), and ongoing certification. For example, framing tactics in messaging can effectively decrease the number of test-takers who fail to show. Nudges can also be used with SMEs to increase JTA survey response rates and committee volunteer numbers. Join us for a panel discussion with researchers and practitioners using nudges in testing.


Full description:

Behavioral nudges have been used forever to help people remember to do things, or follow through on things they started. New research has identified the strategies that are most effective, as well as the research tools for increasing their success in a specific environment. As a result, the use of nudges is moving from ad hoc to intentional and systematic. Educators, corporate offices, and governmental institutions are formally incorporating nudges into their interactions with the public and their staff, and testing programs can use them to support examinees, subject-matter experts, staff, and employers in doing what they already want to do.

The underlying goal is to influence, or “nudge,” people in positive ways that are in their own best interest, as defined by themselves. This presentation will discuss ways that a variety of testing programs are already using nudges and will share the evidence of their effectiveness.

This session will have a panel that includes researchers and practitioners effectively using nudge tactics in the field of testing. They will share real-world successful (and unsuccessful) examples of nudges in testing.

Presentations will include:

  • an overview of nudges: what they are, the evidence for their effectiveness, and a simple research plan for implementing nudges effectively.
  • a discussion of common areas in testing programs where people have agreed to do things, but often need help carrying them out: e.g., examinees would benefit from nudges to meet registration deadlines, study, stay honest, show up for the test on time with appropriate accouterments; SMEs would benefit from nudges to volunteer, write items, review items.
  • a presentation of before-and-after data on how timely phone calls decreased candidate no-shows for a medical licensure performance exam; additional nudge interventions from the program’s in-development continuous assessment will be included.
  • a case study of nudging applied in a high school equivalency program, with specific behavioral techniques and overall results.



Ariely, D. (2008). Predictably Irrational: The Hidden Forces That Shape Our Decisions. New York: HarperCollins.

Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus, and Giroux.

Thaler, R.H., & Sunstein, C.R. (2008). Nudge: Improving Decisions About Health, Wealth, and Happiness. New Haven, CT: Yale University Press.




Kimberly M. Hudson, PhD, National Board of Osteopathic Medical Examiners

Yue Yin, PhD, University of Illinois at Chicago

Tsung-Hsun Tsai, PhD, National Board of Osteopathic Medical Examiners

Grant Number/ Funding Information

Not applicable

Corresponding Author

Kimberly Hudson, 8765 West Higgins Road, Suite 200, Chicago, Illinois 60631; 773-714-0622; Kimberly.shay86@gmail.com

Key Words

Equating, Automated Test Assembly, Optimal Test Assembly, IRT, Rasch Model, CINEG


As early as the 1960s, testing organizations began implementing Automated Test Assembly (ATA) to simplify the laborious process of manually assembling test forms and to enhance the psychometric properties of the examinations (Wightman, 1998; van der Linden, 2005). But it is unclear what impact transitioning to ATA has on equating outcomes. The purpose of this research study was to evaluate outcomes from different IRT scale linking and equating methods when a testing organization transitioned from manual test assembly to ATA.

After crossing each scale linking procedure with each equating method, I calculated error and bias indices (e.g., RMSD, MAD, MSD) and evaluated the decision consistency of the equating outcomes.
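For readers unfamiliar with these indices, the sketch below shows one common way to compute them, comparing equated scores against criterion scores. The definitions are assumptions for illustration: RMSD as root mean squared difference, MAD as mean absolute difference, and MSD as mean signed difference (a bias measure). The text does not spell out the abbreviations, so the study's exact formulas may differ. Decision consistency is sketched as the proportion of examinees receiving the same pass/fail decision under two sets of scores; all function names and data are hypothetical.

```python
import math

def equating_indices(equated, criterion):
    """Error and bias indices between equated and criterion scores (assumed definitions)."""
    diffs = [e - c for e, c in zip(equated, criterion)]
    n = len(diffs)
    return {
        "RMSD": math.sqrt(sum(d * d for d in diffs) / n),  # overall error
        "MAD": sum(abs(d) for d in diffs) / n,             # average error magnitude
        "MSD": sum(diffs) / n,                             # signed bias
    }

def decision_consistency(scores_a, scores_b, cut_score):
    """Proportion of examinees with the same pass/fail decision under both score sets."""
    same = sum((a >= cut_score) == (b >= cut_score) for a, b in zip(scores_a, scores_b))
    return same / len(scores_a)

# Made-up scores, for illustration only.
indices = equating_indices([10.5, 20.0, 29.0], [10.0, 20.0, 30.0])
consistency = decision_consistency([48, 52, 60, 39], [51, 53, 58, 40], cut_score=50)
```

In the made-up data, the first examinee fails under one score set and passes under the other, so three of four decisions agree.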

The results showed that the mean/mean scale linking procedure paired with the IRT preequating method produced the lowest bias and error, and highest level of decision consistency.

The results of this study support the importance of aligning psychometric and test development procedures. The findings suggest that the equating outcomes were related to the similarity in statistical test specifications. ATA resulted in more parallel test forms with better psychometric properties than forms assembled manually. Therefore, the modifications to assembly practices warrant consideration of a new base form for scaling and standard setting.




In high-stakes medical licensure testing programs, test developers and psychometricians work together to develop multiple test forms that can be administered simultaneously to examinees to enhance examination security. Although the volume of forms may differ between testing programs, it is crucial that all test forms are built according to the same test specifications (von Davier, 2010). Furthermore, scores on the test forms must be interchangeable and candidates should perceive no difference between the test forms administered (Kolen & Brennan, 2014). The test development processes and psychometric procedures are inherently connected and both must be considered when developing multiple test forms.

Traditionally, test developers have manually assembled multiple test forms according to a set of content requirements. Test developers typically evaluate statistical criteria such as mean proportion of correct responses (p-value) or mean point-biserial correlation upon completion and make adjustments to confirm that statistical specifications are met. Manual test assembly (MTA) is a time-intensive process, typically requiring the attention and work of multiple test developers. However, with the widespread use of computers, testing organizations can streamline this laborious manual process by developing and employing computer programs to automatically assemble tests. If staff members possess technical computer programming skills, they might create computer programs that can assemble multiple test forms simultaneously by balancing the content and statistical constraints.

When assembling tests manually, test developers use a variety of informational inputs, or constraints, to create multiple forms of an assessment that are balanced in terms of content, difficulty of items, item formats, contextual information of items (e.g., the patient’s life stage), item duration, word count, and exposure rate. Test developers first compile an item pool, which contains a selection of items that meet some basic requirements for inclusion on a test. Scorable items function as operational or anchor items and often have known item parameters based on a prior administration. Test developers iteratively select a group of items that meet the minimum proportions of each domain as specified by the test blueprint and evaluate the range of item statistics or average item statistics, such as p-values and point-biserial correlations. The number of parallel test forms and the number of constraints undoubtedly impact the complexity of manually assembling forms. Moreover, many testing organizations implement this resource-intensive process across numerous testing programs on an annual or semi-annual basis.

Automated Test Assembly (ATA) is an efficient alternative to this laborious process, although it presents unique challenges (Wightman, 1998). Unlike MTA, ATA programs utilize the test information, the summation of item information across the ability continuum, in the creation of multiple parallel test forms. Thus, ATA improves on the manual procedure not only by saving time and resources, but also by enhancing the psychometric quality of balanced forms according to a predetermined set of constraints and maximization of the specified objective function. ATA may improve reliability across examination forms due to the standardization of the test development process. Therefore, the impact of ATA is not just a question of “Can the computer do it?” but rather “Can the computer do it better?”

In medical licensure examinations, there is a critical need for score comparability across test forms, not only to ensure that scores are an accurate, reliable representation of examinee ability, but also to make pass/fail distinctions based on the scores. Earning a passing score on a medical licensure examination allows examinees to enter into supervised medical practice. Therefore, psychometricians work to maintain decision consistency, regardless of the test assembly method and the form administered to examinees. Decision consistency refers to the agreement of an examinee’s pass/fail decisions on two (or more) independent administrations of unique forms, and decision accuracy refers to the agreement between an examinee’s pass/fail decision and the decision that would be made based on the examinee’s true ability (Livingston & Lewis, 1995). These two indices are necessary to evaluate in high-stakes medical licensure testing. In this research, I compare the decision consistency of equated results after implementing ATA.

The results of this research provide a psychometric framework to evaluate results from different equating methods upon the implementation of ATA. When testing organizations implement new test development processes, it is critical to examine the impact on examinee scores (AERA, APA, & NCME, 2014). Testing organizations monitor and evaluate scores and decision consistency of scores on examinations that ultimately license examinees to practice medicine in supervised or unsupervised settings. Neglecting to examine this may inadvertently lead to passing unqualified physicians, or failing qualified physicians.

In ATA, psychometricians and test developers often define linear and/or non-linear constraints in order to maximize a specific objective function, typically the test information function (TIF), at a given score point on the true-ability continuum (van der Linden, 2005). In a high-stakes licensure examination, the minimum passing standard (or cut-score) is commonly used for optimization because it maximizes test information near the cut-score and minimizes the standard error of measurement (SEM) at the cut-score. This leads to increased reliability of scores closest to the cut-score and better accuracy of pass/fail distinctions. Therefore, ATA is designed to enhance the psychometric qualities based on prior item information (i.e., higher reliability coefficients, and lower standard error of measurement near the cut-score), and the efficiency of assembling test forms. However, research has not yet addressed the impact of transitioning from MTA to ATA on results from equating methods. In this study, I investigate the differences in equated results between MTA and ATA forms.

Most ATA processes use a framework of Item Response Theory (IRT) to construct forms with computer programs integrating item-level information according to a set of predetermined constraints. The use of IRT typically goes hand-in-hand with the psychometric framework utilized by the testing program. In IRT, items have a set of unique characteristics; some items are more informative than others at different ability levels. Psychometricians investigate the individual contribution of an item to a test by reviewing the item information function (IIF). The TIF is the summation of IIFs across the ability continuum. The TIF in the ATA represents the characteristics and composition of all items on each test form. Moreover, in the context of medical licensure examinations, using the minimum passing standard as the value for optimization ensures that scores are precise ability estimates for minimally qualified examinees. Thus, when the TIF is optimized at the cut-score, it ultimately reduces the probability of Type I error (unqualified examinees passing the examination). Furthermore, Hambleton, Swaminathan, and Rogers (1991) suggest that the test characteristic curve (TCC) creates the foundation for establishing the equality of multiple test forms, which is certainly the case when optimizing the TIF. The TIF provides aggregate information from each item on the examination, whereas the TCC shows the expected raw score at a given ability level, $\theta$. If we wish to create parallel test forms, then the TCC provides evidence that a given ability level relates to similar expected scores for two parallel forms of the same test. Furthermore, the use of content and statistical constraints in ATA computer programs provides evidence that all test forms are balanced in terms of statistical specifications.
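The relationship among item information, the TIF, the SEM, and the TCC can be illustrated with a minimal Python sketch under the Rasch model. The item difficulties, the cut-score location, and all function names below are hypothetical and are not taken from the testing program described in this study.

```python
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response at ability theta for difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def tif(theta, difficulties):
    """Test information: the sum of Rasch item information, P * (1 - P)."""
    return sum(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)) for b in difficulties)

def sem(theta, difficulties):
    """Standard error of measurement at theta: 1 / sqrt(test information)."""
    return 1.0 / math.sqrt(tif(theta, difficulties))

def tcc(theta, difficulties):
    """Test characteristic curve: expected raw score at theta."""
    return sum(rasch_p(theta, b) for b in difficulties)

# Hypothetical cut-score and two hypothetical five-item forms (logits).
cut = 0.0
form_near_cut = [-0.5, -0.25, 0.0, 0.25, 0.5]  # items targeted at the cut-score
form_spread = [-2.0, -1.0, 0.0, 1.0, 2.0]      # items spread across the scale

for name, form in [("near cut", form_near_cut), ("spread", form_spread)]:
    print(name, round(tif(cut, form), 3), round(sem(cut, form), 3), round(tcc(cut, form), 3))
```

In this toy comparison, the form whose items cluster near the cut-score carries more information and a smaller SEM at that point, which is the property an ATA objective function of this kind is meant to exploit.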

Once parallel forms are assembled, reviewed, published and administered, the results must be analyzed and equated. Equating refers to the use of statistical methods to ensure that scores attained from different test forms can be used interchangeably. Equating can be conducted through a variety of designs, approaches and methods (Kolen & Brennan, 2014). Although there are key differences between IRT and the Rasch model, this research will focus on the applicability of IRT equating methods to a testing program that utilizes the Rasch model as a psychometric framework. Within IRT equating methods, both preequating and postequating methods are widely implemented in K-12 educational settings to ensure scores can be used interchangeably (Tong, Wu, & Xu, 2008). Psychometricians may use IRT to preequate results prior to the start of examination administration, which assuages the tight turnaround time between examination administration and score release. Alternatively, postequating methods use response data from complete current examination administrations (Kolen & Brennan, 2014).

In IRT preequating methods, item parameters are linked from prior calibration(s) to the base form of an examination. For the purpose of this research, item difficulties will be the only item parameter used, which is in alignment with the testing program’s psychometric framework (the Rasch model). The base form (denoted as Form Y) is the form in which the cut-score was established. In order to implement preequating methods, item difficulties for scorable items must be estimated prior to examination administration. Prior to ATA, scorable item difficulties must be known to calculate and maximize the TIF. The alignment of previously calibrated item statistics that are used both for assembling forms using ATA and for preequating may support the applicability of this equating method.

Measurement Models and ATA

IRT allows test developers to “design tests to different sets of specifications and delegate their actual assembly to computer algorithms” (van der Linden, 2005, p. 11). By setting constraints for computerized test assembly, including blueprint domain representation or reasonable ranges for item statistics, test developers can create multiple forms of an examination that are parallel in difficulty. ATA can incorporate item details regardless of the psychometric paradigm used to calibrate or score examinees and can be applied to polytomously or dichotomously scored examinations. As discussed previously, this study uses data previously calibrated using the Rasch model.

While CTT, IRT, and Rasch approaches to ATA can utilize population-dependent item statistics (i.e., p-values and discrimination indices) as constraints, in CTT there is no equivalent metric to the TIF. In order to construct parallel test forms in ATA, Armstrong, Jones and Wang (1994) maximized score reliability through a network-flow model. The authors stated that it was advantageous to use the CTT approach because it was computationally less expensive and produced comparable results in relation to the IRT approach to ATA. When this research was published, computational power was indeed a challenge; however, computer memory and processing power have advanced considerably since then, so the cited advantage does not stand the test of time. As such, IRT or Rasch approaches to ATA are more supported in the literature and are the focus of this study.

Prior to beginning ATA, test assemblers must calibrate response data to estimate item parameters from the sample population. Psychometricians often examine the goodness-of-fit of the data to determine the best IRT model (i.e., 1-PL, 2-PL) or confirm that the data fit the Rasch model. Once the examination is administered, psychometricians anchor item parameters based on prior calibrations to estimate examinee ability (van der Linden, 2005). In this study, I calibrate data using the Rasch model and will provide some evidence supporting the appropriateness of the model.

In ATA, test assemblers often optimize the TIF and evaluate the similarity of the forms by comparing the TIFs. However, even well-matched TIFs do not necessarily yield equitable score distributions (van der Linden, 2005). Thus, psychometricians must also continuously evaluate and monitor the score distributions once the examination forms are administered. The main question of this study is which IRT equating method (IRT observed score, IRT true score, or IRT preequating) yields the most comparable scores and decision consistencies when transitioning from MTA to ATA. In the following section, I provide a foundation of linking, equating, and scale linking as it pertains to this study.


Equating is the special case of linking in which psychometricians transform sets of scores from different assessment forms onto the same scale. By definition, equating methods are only applied to assessment forms that have the same psychometric and statistical properties and test specifications. The primary goal of equating is to allow scores to be used interchangeably, regardless of the form that an examinee was administered (Holland & Dorans, 2006; Kolen & Brennan, 2014).

Assessment programs can employ a variety of equating designs and methods, each design with unique characteristics and assumptions. Assessment programs often administer examinations within and across years. For the purpose of this section, I notate an original form of an examination as Y and a new form of an examination as X, with the understanding that assessment programs may administer multiple new forms or multiple original forms. The common-item nonequivalent groups (CINEG) design is commonly used and requires previously administered items from original forms of an examination to be included on new forms as a set of common or anchor items. The CINEG design is considered a more secure design than the random groups design because only a set of common items is exposed from an original form, rather than an entire original form.

The CINEG design not only accounts for the difference in form difficulty, but also accounts for the difference in the population of test-takers. The statistical role of the common items is to control for differences in the populations, thereby removing bias from the equating function. In order to implement the common items design, the common items must meet several requirements (Dorans et al., 2010). First, the common items must follow the same content and statistical specifications as the full-length test. Second, there should be a strong positive correlation between scores on the full-length test form and scores on the common items because the common items follow the same specifications as the full-length test. Third, measurement and administration conditions for the common items must be similar across new and original forms. Lastly, prior research recommends that common item sets include at least 20% of the full-length test or consist of at least 30 items (Angoff, 1971; Kolen & Brennan, 2014). Satisfying these requirements ultimately ensures that the reported scores and the decisions based on the reported scores are accurate and reliable. The testing program used for this study meets the conditions described above.

IRT equating methods can be applied to data calibrated using the Rasch model and are the focus of this study. In this section, IRT equating methods are discussed in detail; however, first psychometricians must use scale linking procedures to examine the relationship between newly estimated item parameters and original estimations of item parameters from two independent calibrations. Due to the assumption of item invariance, if item parameters are known, no equating or scale linking is necessary and IRT preequating methods can be implemented prior to test administration (Hambleton et al., 1991). However, in practice it is important to implement scale linking procedures because there are often differences in item parameter estimates (Stocking, 1991).

Scale linking is the process by which independently calibrated item difficulties are linked onto a common scale. Several methods can be used to calculate scaling constants in order to place the item difficulties from form X on the same scale as Y (Hambleton et al., 1991). The mean/mean, mean/sigma, and TCC methods are discussed in their application to this study. Prior research supports the performance of TCC methods over other methods (i.e., mean/mean or mean/sigma) for scale linking due to the stability of the results and the precision, even when item parameters had modest standard errors (Kolen & Brennan, 2014; Li et al., 2012). Other research investigated the adequacy of different scale linking procedures within the Rasch model.

The mean/sigma method calculates scaling constants A and B based on the mean and standard deviation of the difficulty parameters of the common items on form X. There are two main TCC scale linking procedures, which are iterative processes that utilize item parameter estimates; the focus of the current study is on the Stocking and Lord (1983) procedure. The scale indeterminacy property of IRT is used in this method, such that an examinee with a given ability will have the same probability of answering an item correctly regardless of the scale used to report scores. The Stocking and Lord TCC procedure calculates the probability of correctly answering each common item on the original scale and on the new scale, taking the difference in examinee ability into consideration. Equation 10 represents the difference in TCCs between common items administered on form Y and form X, respectively. Then an iterative process solves for A and B by minimizing this difference across all examinees.
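The three scale linking procedures can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used in the study: the common-item difficulties and ability points are hypothetical, the mean/mean method is reduced to a shift because the Rasch slope is fixed at 1, and the Stocking and Lord criterion is minimized over the shift B only (A fixed at 1) with a simple golden-section search.

```python
import math
from statistics import mean, pstdev

def rasch_p(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mean_mean(b_x, b_y):
    """Rasch case: slopes are fixed at 1, so A = 1 and B is the mean shift."""
    return 1.0, mean(b_y) - mean(b_x)

def mean_sigma(b_x, b_y):
    """A from the ratio of standard deviations, B from the means."""
    a = pstdev(b_y) / pstdev(b_x)
    return a, mean(b_y) - a * mean(b_x)

def stocking_lord_loss(shift, b_x, b_y, thetas):
    """Sum of squared TCC differences over ability points (A fixed at 1)."""
    total = 0.0
    for t in thetas:
        tcc_y = sum(rasch_p(t, b) for b in b_y)
        tcc_x = sum(rasch_p(t, b + shift) for b in b_x)
        total += (tcc_y - tcc_x) ** 2
    return total

def stocking_lord_shift(b_x, b_y, thetas, lo=-3.0, hi=3.0, tol=1e-6):
    """Golden-section search for the shift B that minimizes the TCC criterion."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if stocking_lord_loss(c, b_x, b_y, thetas) < stocking_lord_loss(d, b_x, b_y, thetas):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2.0

# Hypothetical common-item difficulties: form X sits 0.5 logits below form Y.
b_x = [-1.2, -0.4, 0.1, 0.7, 1.3]
b_y = [b + 0.5 for b in b_x]
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical examinee abilities

print(mean_mean(b_x, b_y), mean_sigma(b_x, b_y), stocking_lord_shift(b_x, b_y, thetas))
```

When the two sets of difficulties differ only by a constant, as in this toy data, all three procedures recover the same shift; they diverge when individual common items drift between calibrations.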



Once item parameters are on the same scale, IRT equating methods are employed. IRT true score equating is the most commonly used IRT equating method. In IRT true score equating, true scores ($\tau$) are represented as the number-correct score for examinee $i$ with given ability $\theta_i$ (Kolen & Brennan, 2014). Additionally, true score equating assumes that there are no omitted responses (von Davier & Wilson, 2007). In a simplistic example, psychometricians first identify a true score on form X, then estimate the corresponding ability level (see equation 12). Then, the true score on form Y ($\tau_Y$) is determined by using the corresponding ability level (see equation 13). Therefore, the equivalent score is the inverse of the ability distribution. This process is iterative, which typically involves the Newton-Raphson Method (Kolen & Brennan, 2014; Han et al., 1997).

$\tau_X(\theta_i) = \sum_{j:X} p_{ij}(\theta_i)$ and $\tau_Y(\theta_i) = \sum_{j:Y} p_{ij}(\theta_i)$          (12)
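The two-step logic of true score equating can be sketched in Python under the Rasch model. This is an illustrative sketch only: the item difficulties are hypothetical, the function names are not from the PIE program or any other software used in the study, and the Newton-Raphson step is capped at one logit as a simple safeguard against overshooting.

```python
import math

def rasch_p(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def tcc(theta, difficulties):
    """Expected number-correct (true) score at ability theta."""
    return sum(rasch_p(theta, b) for b in difficulties)

def theta_for_true_score(tau, difficulties, tol=1e-8, max_iter=100):
    """Newton-Raphson solve of tcc(theta) = tau for theta."""
    if not 0.0 < tau < len(difficulties):
        raise ValueError("true score must lie strictly between 0 and the test length")
    theta = 0.0
    for _ in range(max_iter):
        ps = [rasch_p(theta, b) for b in difficulties]
        residual = sum(ps) - tau
        if abs(residual) < tol:
            break
        slope = sum(p * (1.0 - p) for p in ps)  # derivative of the TCC
        step = max(-1.0, min(1.0, residual / slope))
        theta -= step
    return theta

def equate_true_score(tau_x, b_x, b_y):
    """Form X true score -> ability -> equivalent form Y true score."""
    return tcc(theta_for_true_score(tau_x, b_x), b_y)

# Hypothetical forms: form Y is uniformly 0.5 logits harder than form X.
b_x = [-1.0, -0.5, 0.0, 0.5, 1.0]
b_y = [b + 0.5 for b in b_x]
for raw in (1, 2, 3, 4):
    print(raw, round(equate_true_score(raw, b_x, b_y), 3))
```

Because form Y is harder in this toy example, each form X true score maps to a lower form Y equivalent. Raw scores of zero and the maximum have no finite ability estimate, which is why the sketch rejects them.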


Unlike IRT true score equating methods, the IRT observed score equating method depends on the distribution of examinee abilities. The IRT observed score equating method is similar to equipercentile equating methods without the application of additional smoothing techniques, as previously discussed. It requires specifying the distributional characteristics of examinees prior to equating, using prior distributions (Kolen & Brennan, 2014).

All of the IRT equating methods previously discussed require data from the current test administration cycle. However, the IRT preequating method can be used when items are pretested prior to operational use. Once items are on the same scale, psychometricians generate raw-to-scale conversion tables prior to form administration, which ultimately decreases the workload for score release (Kolen & Brennan, 2014). Many testing organizations utilize IRT preequating in order to shorten the window for score release after examination administration. Testing organizations may also prefer IRT preequating methods due to their flexibility when equating scores for computer-based examinations that are administered intermittently over a long testing cycle.

Researchers have compared the results among equating designs, methodologies and procedures; yet no researchers have compared scale linking or equating outcomes among ATA and MTA forms. In the current study, RMSD, MSD and MAD were calculated to examine the error and bias associated with scores. Researchers have commonly used these indices to evaluate the comparability of equating methods (Antal, Proctor, & Melican, 2014; Gao, He, & Ruan, 2012; Kolen & Harris, 1990).

Decision Consistency

In the context of testing programs that aim to categorize examinees into one or more groups based on their scores, such as medical licensure examinations, classification accuracy is a measurement of whether examinees were accurately classified based on their true ability (Lee, 2010).

Research Questions

The goal of this study is to compare the equating results when a testing organization moved from MTA to ATA. The research questions address the comparability of outcomes from three different methodological approaches to equating, after combining three IRT equating methods with three scale linking procedures.

  1. Which method of IRT equating (e.g., IRT observed score, IRT true score, or IRT preequating methods) minimizes error and bias associated between MTA and ATA developed forms?
  2. Which method of IRT equating (e.g., IRT observed score, IRT true score, or IRT preequating methods) yields the highest expected decision consistency of pass/fail distinctions between MTA and ATA developed forms?



In this study, I used two years of response data from a large-scale medical licensure examination. From the 36 Y forms, I selected four forms. There was item overlap among the 36 Y forms, which made it possible to calibrate data from all 36 Y forms concurrently using the Rasch model.

First, I aggregated key information for each form, where pretest items were not embedded within each Y form. The pretest design used a total of 12 unique pretest blocks, each consisting of 50 items with overlap. The test administration vendor randomly assigned pretest blocks to examinees. Therefore, pretest items needed to be reviewed and selected for the form-by-form calibration for the CINEG design. Figure 5 shows the design of an intact Y form (denoted form A) of operational items and a plausible assignment of six pretest blocks. In Figure 5, Form A consists only of operational items; the test vendor randomly assigned one pretest block from group A (PTA) and one pretest block from group B (PTB). The diagram is a simplified depiction of the true design, which can ultimately yield more than 5,100 different combinations. Therefore, I employed a threshold of 30 responses to determine which pretest items had sufficient exposure for inclusion in the form-by-form calibration. Despite anchoring the item difficulties, at least 30 exposures ensured there were sufficient data to investigate data-model fit.

The concurrent calibration of the operational and pretest items on Y forms resulted in item difficulties on the same scale of measurement. Additionally, I selected four X forms. I used three criteria to select the eight forms for this study: (a) X forms with the highest volume of administrations after the first several weeks following the examination launch, (b) X forms and Y forms with at least 20 percent overlap or at least 30 common items for scale linking purposes, and (c) common item sets that were representative of the test blueprint (Angoff, 1971; Kolen & Brennan, 2014). The data design is shown in Figure 6. Each pairing of forms in Figure 6 has its own designated set of common items.

Approximately 7,600 examinees took one of the Y forms in year 1 and 4,300 first-time examinees took one of the X forms in the first testing window of year 2. After selecting the four forms, as previously described, I used data from the approximately 1,300 examinees who were administered  or  and the approximately 1,200 examinees who were administered  or . Table IV displays a summary of the data selected for this research study.

Response data from year 1 on all 36 Y forms were concurrently calibrated using WINSTEPS® (Linacre, 2017). The estimated item difficulties were then used as anchors for each separate form calibration of . Y is considered the base form of the examination and therefore no equating on original forms was conducted.

Data Analyses

All data management and analyses were conducted in RStudio (2016), unless otherwise specified. The criterion of  was used to examine the statistical significance of tests, unless otherwise specified.

1. Research Question 2: Equating Methods and Error

Which method of IRT equating (e.g., IRT observed score, IRT true score, or IRT preequating methods) minimizes error and bias associated between MTA and ATA developed forms?

I employed three scale linking approaches (mean/mean, mean/sigma, and Stocking-Lord TCC) and three equating methods (IRT observed score, IRT true score, and IRT preequating). I utilized the PIE computer program to implement the IRT observed score and IRT true score equating methods (Hanson et al., 2004b). To assess the equating results, I compared the root mean squared difference (RMSD), mean absolute difference (MAD), and mean signed difference (MSD) on X’ to Y. I then evaluated which method minimizes bias by identifying RMSD values close to 0 and evaluated which method minimizes error by identifying MSD and MAD close to 0. Higher indices indicate an accumulation of error and are not preferred. Findings from prior research show that IRT preequating methods often have higher levels of error associated with the examinee scores. However, due to the alignment of using precalibrated item difficulties for both ATA and preequating methods, I expect that the design of ATA may have an impact on the equated results.
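The three indices can be computed with a short Python sketch. The score vectors below are hypothetical, and the sketch assumes unweighted differences between equated scores (X’) and base-form scores (Y) at matching score points; the study may have weighted the differences differently.

```python
import math

def equating_indices(equated, reference):
    """RMSD, MSD, and MAD of the differences (equated - reference)."""
    if len(equated) != len(reference) or not equated:
        raise ValueError("score vectors must be non-empty and of equal length")
    diffs = [e - r for e, r in zip(equated, reference)]
    n = len(diffs)
    return {
        "RMSD": math.sqrt(sum(d * d for d in diffs) / n),
        "MSD": sum(diffs) / n,                  # signed: keeps the direction
        "MAD": sum(abs(d) for d in diffs) / n,  # absolute: ignores the direction
    }

# Hypothetical equated X' scores and base-form Y scores at four score points.
x_prime = [10.0, 12.0, 15.0, 19.0]
y_base = [11.0, 14.0, 16.0, 20.0]
print(equating_indices(x_prime, y_base))
```

When every difference has the same sign, MAD equals the absolute value of MSD, a pattern that appears in several rows of Table XV.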

2. Research Question 3: Equating Methods and Passing Rates

Which method of IRT equating (e.g., IRT observed score, IRT true score, or IRT preequating methods) yields the highest expected decision consistency of pass/fail distinctions between MTA and ATA developed forms?

Using the outcomes from research question 1, I estimated decision consistency indices using Huynh’s methodology (1990), which uses the probability density function, item curve functions (ICFs), and relative frequencies of a single population to estimate two common decision consistency indices: a raw agreement index and kappa (see equations 18 and 19). The raw agreement index is calculated using the cumulative distribution function of test scores and the relative frequencies of test scores. Kappa is calculated from the difference between the raw agreement index and the expected proportion of consistent decisions if there is no relationship between test scores. Kappa indicates the decision consistency beyond what is expected by chance (Subkoviak, 1985).



,                                                                                                       (19)

,                                                                                                     (20)



Where represents the ability level at a given raw score, ;

represents the difference in cumulative distribution functions of the raw cut-score,  at ability level, ;

represents the relative frequency distribution at  and

represents the number of classifications.
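A raw agreement index and kappa of this general kind can be sketched in Python under the Rasch model. This is a generic IRT-based formulation offered as an assumption-laden illustration, not a reproduction of Huynh’s (1990) procedure or of the equations above: it assumes two independent administrations of the same form, a single raw cut-score, and a discrete ability distribution. The item difficulties, ability points, weights, and function names are all hypothetical.

```python
import math

def rasch_p(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def raw_score_dist(theta, difficulties):
    """Conditional raw score distribution at theta (Lord-Wingersky recursion)."""
    dist = [1.0]
    for b in difficulties:
        p = rasch_p(theta, b)
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)  # item answered incorrectly
            new[score + 1] += prob * p      # item answered correctly
        dist = new
    return dist

def decision_consistency(difficulties, raw_cut, thetas, weights):
    """Raw agreement and kappa for pass/fail over two independent administrations."""
    agree = 0.0
    pass_rate = 0.0
    for theta, w in zip(thetas, weights):
        p_pass = sum(raw_score_dist(theta, difficulties)[raw_cut:])
        agree += w * (p_pass ** 2 + (1.0 - p_pass) ** 2)
        pass_rate += w * p_pass
    chance = pass_rate ** 2 + (1.0 - pass_rate) ** 2
    kappa = (agree - chance) / (1.0 - chance) if chance < 1.0 else float("nan")
    return agree, kappa

# Hypothetical ten-item form, raw cut-score of 6, and a coarse ability distribution.
form = [-1.5, -1.0, -0.5, -0.25, 0.0, 0.0, 0.25, 0.5, 1.0, 1.5]
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]
weights = [0.1, 0.2, 0.4, 0.2, 0.1]
print(decision_consistency(form, 6, thetas, weights))
```

Kappa rescales the raw agreement by the agreement expected from the marginal pass rate alone, so two forms with the same raw agreement can differ in kappa when their pass rates differ.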


Research Question 2

To evaluate the adequacy of the results, I calculated the RMSD, MSD, and MAD. RMSD is a measure of bias, and MSD and MAD are measures of random error. Values closer to 0 indicate smaller raw score point differences between MTA and ATA forms. Overall, there were large differences in the amount of bias and error across forms and equating methods; therefore, RMSD, MSD, and MAD are presented separately for each form (see Table XV). Across all forms, the equating and scale linking combination with the least amount of error and bias was the mean/mean preequating method.

Table XV


                   Observed Score                 True Score                Preequating
Form    Index       MM      MS      SL        MM      MS      SL        MM      MS      SL
Form 1  RMSD     19.35   20.46   21.85     19.51   20.55   22.10      8.02    8.41    8.28
        MSD     -18.83  -20.27  -21.38    -18.97  -20.35  -21.62     -7.71   -8.05   -7.96
        MAD      18.83   20.27   21.38     18.97   20.35   21.62      7.71    8.05    7.96
Form 2  RMSD     10.37    5.47    9.40     10.35    5.70    9.40      4.74    5.24    4.91
        MSD     -10.31   -4.18   -9.35    -10.29   -4.29   -9.34     -4.31   -4.75   -4.46
        MAD      10.31    4.34    9.35     10.29    4.49    9.34      4.32    4.75    4.47
Form 3  RMSD      2.53    4.57    1.53      2.51    4.53    1.53      2.51    2.96    2.68
        MSD      -2.32   -4.45   -0.62     -2.27   -4.39   -0.58     -1.69   -2.08   -1.89
        MAD       2.38    4.48    1.29      2.35    4.41    1.28      1.99    2.34    2.15
Form 4  RMSD     12.42   12.46   15.91     12.92   13.03   16.84      2.55    3.01    2.82
        MSD     -11.92  -11.83  -15.29    -12.30  -12.26  -16.02     -1.90   -2.27   -2.20
        MAD      11.92   11.83   15.29     12.30   12.26   16.02      2.07    2.43    2.31

Note. MM represents mean/mean scale linking, MS represents the mean/sigma scale linking, and SL represents the Stocking and Lord TCC scale linking procedure. Due to the disparate index values across forms, results are shown for each form separately.

Boldface signifies the more favorable results, with indices closest to 0, per index per form (by row).


Preequating Method

Across the three equating methods paired with the three scale linking procedures, the results indicated that the mean/mean scale linking procedure with the preequating method performed the most favorably for three of the four forms (forms 1, 2, and 4). For form 1, the mean/mean preequating method resulted in lower bias and error in comparison to all other methods (RMSD = 8.02, MSD = -7.71, and MAD = 7.71), whereas the highest amount of bias was related to the Stocking and Lord TCC procedure paired with the IRT true score equating method (RMSD = 22.10, MSD = -21.62, MAD = 21.62). For form 2, the mean/mean preequating method produced the most favorable results in comparison to all other methods (RMSD = 4.74, MSD = -4.31, and MAD = 4.32). However, the Stocking and Lord TCC scale linking procedure paired with the preequating method produced only slightly higher results than the mean/mean preequating method (within 0.5 raw score points). The small difference of 0.5 raw score points in RMSD, MSD, and MAD between the scale linking procedures within the preequating method was present across all forms. For form 3, the mean/mean preequating method produced a slightly higher RMSD in comparison to the Stocking and Lord true score equating method (RMSD = 2.51, RMSD = 1.53, respectively). These differences relate to a difference of about 1 raw score point. Therefore, the results from the mean/mean preequating method showed a slight improvement over the other scale linking procedures within the preequating method, although there was very little practical difference in the results across each scale linking procedure.

True Score and Observed Score Equating Methods

The results from the true score and observed score equating methods with each scale linking procedure were comparable across all forms. For form 1, the true and observed score methods yielded very consistent results. Specifically, the mean/mean observed score method and the mean/mean true score method resulted in similar levels of error (MSD = -18.83, MSD = -18.97, respectively), and the maximum deviation between raw scores in terms of MSD of the mean/sigma true score and observed score methods was approximately 0.35 raw score points. Furthermore, for form 2, the Stocking and Lord TCC scale linking procedure paired with the observed score and true score methods produced similarly high amounts of error (RMSD = 9.400, 9.399, respectively). Unique to form 3, the Stocking and Lord true score and observed score methods produced the lowest bias (RMSD = 1.53) and error (MAD = 1.279 and MAD = 1.292, respectively) across all conditions. The results from combining each scale linking procedure with the true score and observed score methods varied across forms; in some cases, the Stocking and Lord TCC procedure performed least favorably (forms 1 and 4), while in other cases, the mean/sigma scale linking procedure performed the least favorably (form 3). Therefore, the findings are inconclusive in terms of the preferred scale linking procedure for the IRT observed score and true score methods, although the evidence suggests that the Stocking and Lord TCC procedure produced higher levels of error for two forms.

Research Question 3

Overall, the mean/mean preequating method and the Stocking and Lord TCC preequating method performed the most favorably ( ). The true score and observed score methods produced similar levels of decision consistency, indicating not much practical difference.


Figure 10. Mean decision consistency indices,  (blue) and (red) across all forms. MM represents mean/mean scale linking, MS represents the mean/sigma scale linking, and SL represents the Stocking and Lord TCC scale linking procedure.


The average decision consistency indices across all ATA forms improved in comparison to baseline estimates using MTA forms. For example, the raw agreement index was 1% to 3% greater for ATA forms than MTA forms. For 3 of the 4 forms, the raw agreement index was the highest for preequating methods. Similar to the findings from research question 2, the results from the decision consistency evaluation indicated that the Stocking and Lord TCC scale linking procedure paired with the IRT true and observed score methods performed the most favorably for form 3 (see Figure 11). Results for each form and equating method are displayed in Appendix B.







To examine the differences in equating outcomes that transitioning from MTA to ATA introduces, I employed three scale linking procedures and three equating methodologies. I calculated the RMSD, MSD, and MAD to determine which combination of scale linking procedure and equating method resulted in the least amount of bias and error. The preequating method with the mean/mean scale linking procedure produced the most favorable results for three of the four forms, even when the number of common items did not meet recommended criteria. Lastly, results from the decision consistency analyses indicated that the preequating method outperformed the true score and observed score equating methods in terms of raw agreement; however, the true score and observed score methods produced the most favorable decision consistency in terms of kappa. The variation in error terms of equated scores across forms suggests that MTA and ATA forms cannot be directly compared. If testing organizations begin to implement ATA for form assembly, they should give thoughtful consideration to the use of MTA forms as the base forms for equating purposes.

Research Question 2: Equating Methods and Error

Because IRT equating methods are implemented after scale linking, the results of the equating methods depend on the quality of the results of the scale linking procedures. The common item sets were sufficiently sized for only form pair 1 and form pair 3; therefore, the generalizability of the equating results for forms 2 and 4 is limited. Yet all previously known item statistics were used for preequating, not just those included in the common item set.
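The dependence of equating on linked item difficulties can be illustrated with a minimal sketch of IRT true score equating under the Rasch model. It is an illustration only, not the software used in this study: the function names, the bisection search, and the item difficulties are all assumptions introduced for the example. A raw score on the new form is converted to the ability at which the new form's test characteristic curve (TCC) equals that score, and the base form's TCC is then evaluated at that ability.

```python
import math

def tcc(theta, difficulties):
    """Rasch test characteristic curve: expected raw score at ability theta."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)

def true_score_equate(raw_new, b_new, b_base, lo=-10.0, hi=10.0, tol=1e-8):
    """Map a new-form raw score to the base-form raw score scale.

    Assumes both sets of difficulties are already on a common (linked) scale
    and that 0 < raw_new < number of new-form items.
    """
    if not 0 < raw_new < len(b_new):
        raise ValueError("raw score must lie strictly between 0 and test length")
    # Bisection works because the TCC increases monotonically with theta.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tcc(mid, b_new) < raw_new:
            lo = mid
        else:
            hi = mid
    return tcc((lo + hi) / 2.0, b_base)

# Hypothetical linked difficulties (logits), not values from the study.
b_new = [-1.0, 0.0, 1.0]
b_base = [-1.5, -0.5, 0.5]   # an easier base form
equated = true_score_equate(2.0, b_new, b_base)
```

Any error in the linking step shifts the difficulties passed to this routine, and the shift carries directly into the equated score.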

It is important to note that, for the purpose of this research, raw scores were used to evaluate the error and bias associated with the equated scores for each method. Overall, the optimal method for three of the four forms was the mean/mean scale linking procedure paired with the preequating method. The mean/mean preequating method produced the lowest amount of bias as measured by RMSD. While other methods produced slightly lower MSD or MAD, the differences were not appreciable or practically meaningful (typically less than 0.2 raw score points). For the two forms with sufficient common item sets, the mean/mean preequating method and the Stocking and Lord TCC true score method produced the most favorable results. Similarly favorable findings for the mean/mean preequating method were found across the remaining forms; that is, despite an insufficient number of items for scale linking purposes, the results still supported preequating methods. This may be due in part to the similarity between the Rasch equating model and the mean/mean preequating method.
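As a minimal sketch of how the three summary indices relate to one another, the following computes unweighted RMSD, MSD, and MAD from paired equated and base-form raw scores. The unweighted form and the function name are assumptions made for illustration; operational work often weights each score point by its frequency, and the scores shown are hypothetical.

```python
import math

def deviation_summary(equated, base):
    """Return (RMSD, MSD, MAD) for paired equated and base-form raw scores."""
    if len(equated) != len(base) or not equated:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [e - b for e, b in zip(equated, base)]
    n = len(diffs)
    msd = sum(diffs) / n                             # signed: direction of bias
    mad = sum(abs(d) for d in diffs) / n             # unsigned: typical size of error
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)  # weights large deviations more
    return rmsd, msd, mad

# Hypothetical scores, not data from the study.
equated_scores = [48.0, 55.5, 61.0, 70.5]
base_scores = [50.0, 57.0, 63.0, 71.0]
rmsd, msd, mad = deviation_summary(equated_scores, base_scores)
```

When every deviation has the same sign, MAD equals the absolute value of MSD, and RMSD can never fall below MAD. Both patterns appear in the values reported above (for example, MSD = -7.71 with MAD = 7.71 and RMSD = 8.02).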

There were large differences in the magnitude of results of RMSD between true score, observed score and preequating methods across the forms. Specifically, form 1 had the highest values of RMSD across all equating methods, whereas form 3 had the lowest values of RMSD across all equating methods. The differences in RMSD across the forms provide additional evidence that the new and original forms were not built to the same statistical specifications.

Results from prior research indicated that preequating methods have performed poorly in comparison to postequating methods. Kolen and Harris (1990) reported that the IRT preequating method resulted in the highest values of RMSD and MSD in comparison to IRT postequating methods. Tong and Kolen (2005) compared the adequacy of equated scores from the traditional equipercentile, IRT true score, and IRT observed score equating methods using three criteria and found that the IRT true score method performed least favorably in comparison to the IRT observed score method. In this respect, the results from the current study disagreed with previous literature. Yet the goal of the current study was to evaluate differences in equating outcomes between MTA and ATA forms. To date, no research studies have compared outcomes from different equating methods when testing organizations transitioned from MTA to ATA. The lack of consistent findings with prior literature may therefore relate to the change in test development procedures, the differences in psychometric framework (i.e., IRT 2-PL versus Rasch model), or differences in the nature and purpose of the testing program (i.e., K-12 versus medical licensure). For example, much of the body of literature on equating utilizes K-12 assessment programs to investigate differences in equating methodologies. K-12 assessment programs are built to different test specifications, as the purpose of these examinations may be to evaluate and monitor student growth rather than to pass or fail examinees. Often these types of assessment programs have different characteristics than medical licensure examinations, including shorter administration windows and a different mode of delivery. The difference in results can also be explained by the utility of a purposeful equating design.
Although the testing program used for the current study did not operationally implement a CINEG design, I employed this design by selecting data that conformed to the design (i.e., requirements were met). Two of the four forms had sufficiently sized common item sets. Yet the favorable findings for the preequating method were in agreement for three of the four forms used. This may be because the preequating methodologies relied on a quality bank of linked items rather than on a small common item set.

The RMSD, MAD, and MSD are commonly used measures to gain an overall understanding of the differences between equated scores and those on the base form. Yet the standard error of equating is another commonly used approach to evaluate the adequacy of equating results and can be used to gain a better understanding of the error associated across the distribution of equated scores. The standard error of equating replicates hypothetical samples to approximate the standard deviation of each equated score (Kolen & Brennan, 2014). Future research can expand on this study by calculating the standard error of equating for the IRT true and observed score equating methods.
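One way to approximate the standard error of equating by replicating hypothetical samples is the bootstrap. The sketch below is illustrative only and rests on several assumptions: examinees are resampled independently within each form, and a simple mean equating function (the hypothetical `mean_equate`) stands in for the IRT true and observed score methods, which would be substituted in an actual application.

```python
import random
import statistics

def mean_equate(new_scores, base_scores):
    """Stand-in equating function (hypothetical): shift by the mean difference."""
    shift = statistics.fmean(base_scores) - statistics.fmean(new_scores)
    return lambda x: x + shift

def bootstrap_see(new_scores, base_scores, score_points, reps=500, seed=0):
    """Bootstrap standard error of equating at each requested raw score point."""
    rng = random.Random(seed)
    replicates = {x: [] for x in score_points}
    for _ in range(reps):
        # Resample examinees with replacement within each form.
        new_star = rng.choices(new_scores, k=len(new_scores))
        base_star = rng.choices(base_scores, k=len(base_scores))
        equate = mean_equate(new_star, base_star)
        for x in score_points:
            replicates[x].append(equate(x))
    # The SD of the replicated equated scores estimates the standard error.
    return {x: statistics.stdev(vals) for x, vals in replicates.items()}

# Hypothetical raw scores, not data from the study.
see = bootstrap_see([10, 12, 15, 18, 20, 22], [11, 14, 16, 19, 21, 25], [10, 20])
```

With mean equating the standard error is identical at every score point; with IRT methods it would generally vary across the score distribution, which is the additional information the paragraph above describes.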

The results of this research question suggest that prior to implementing ATA, the equating design should be thoroughly considered in light of the purpose of the assessment. In agreement with the best principles of test assembly, any time new test development procedures are implemented, results processing should be carefully considered (AERA, APA, & NCME, 2014). The implementation of, and variation in, test assembly procedures necessitates reviewing and evaluating current psychometric procedures (e.g., standard setting, equating designs, etc.). A key recommendation for practitioners is to discontinue the use of MTA forms as the base form when an organization is newly implementing ATA procedures, as the findings from this research suggest that there is too much variation in the statistical specifications of MTA forms to support their continued use as the base form.

Research Question 3: Equating Methods and Decision Consistency

Although there are many ways to evaluate decision consistency, decision consistency was measured here using two estimates: the proportion of raw agreement and the kappa index, which corrects the raw agreement index for the agreement expected by chance. Both the mean/mean preequating methods and Stocking and Lord TCC preequating methods produced the highest raw agreement indices across forms ( ). This finding supports the findings from research question 2, in which the lowest error and bias were found in the same methods. The raw agreement indices for other equating methods differed slightly within each form, typically within 1%. In comparison, the kappa estimates were inconsistent across forms (see Appendix B). Moreover, the decision consistency index of ATA forms was 1% to 3% greater than that of the MTA forms. This finding provides additional evidence that ATA enhances the psychometric properties of examinations that make pass/fail distinctions.
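The relationship between the two indices can be sketched as follows. This is a simplified illustration that assumes pass/fail decisions from two forms are available for the same examinees; the indices in this study may have been estimated differently (for example, from a single administration under the Rasch model), and the function name and data are hypothetical.

```python
def decision_consistency(pass_a, pass_b):
    """Raw agreement and kappa for two sets of pass/fail decisions."""
    n = len(pass_a)
    if n == 0 or n != len(pass_b):
        raise ValueError("need two non-empty sequences of equal length")
    # Proportion of examinees classified the same way on both forms.
    agree = sum(a == b for a, b in zip(pass_a, pass_b)) / n
    rate_a = sum(pass_a) / n
    rate_b = sum(pass_b) / n
    # Agreement expected by chance if the two decisions were independent.
    chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    kappa = (agree - chance) / (1 - chance) if chance < 1 else float("nan")
    return agree, kappa

# Hypothetical decisions for ten examinees (True = pass).
form_a = [True, True, True, True, True, True, True, True, False, False]
form_b = [True, True, True, True, True, True, True, False, False, True]
agreement, kappa = decision_consistency(form_a, form_b)
```

Because chance agreement is large when most examinees pass, a high raw agreement can coexist with a much lower kappa. This may help explain why the two indices rank the equating methods differently.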

Although decision consistency is an important aspect for psychometricians to explore, it does not fully explain outcomes of examinations with pass/fail decisions. Specifically, without simulation studies in which true ability is known, one cannot know for certain that decisions are accurate. Although there are ways to explore and provide evidence of decision accuracy, doing so was beyond the scope of the current study. Future research is warranted on different approaches, such as that of Lee (2010), to evaluate both decision consistency and decision accuracy when testing programs newly implement ATA.


The testing program did not implement a CINEG design operationally; however, the data lent themselves readily to a CINEG design because of the use of anchor blocks in ATA. Although I confirmed key equating requirements and controlled for others by carefully selecting the forms used in the study, the common item sets were not a perfect representation of the content or statistical specifications of the entire test. Moreover, the pretest item design had also changed between MTA and ATA forms, which may have influenced the findings. Specifically, pretest item blocks were assigned randomly to examinees in Y forms, whereas pretest item blocks were embedded within each X form. Although such constraints are understandable when using operational data, they limit the findings. Lastly, the 0.3 logit criterion used to establish the common item set is operationally employed, although there are alternative methods one can use to identify outlying items in the common item set. Therefore, future research is warranted to address the replicability of this study when considering the purpose of the examination and equating designs in conjunction with test design decisions (i.e., ATA).
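A 0.3 logit screening rule for common items can be sketched as below. This is one common reading of such a rule, offered as an assumption rather than as the operational procedure of the testing program: the difficulty difference for each common item is centered on the mean difference across the retained items, and the most displaced item is dropped repeatedly until all remaining displacements fall within the threshold. The function name and item values are hypothetical.

```python
def stable_anchor_items(b_base, b_new, threshold=0.3):
    """Return common item ids whose centered difficulty shift is within threshold.

    b_base and b_new map item id -> Rasch difficulty (logits) from two calibrations.
    """
    keep = set(b_base) & set(b_new)
    while len(keep) > 1:
        # Mean shift between calibrations over the currently retained items.
        shift = sum(b_new[i] - b_base[i] for i in keep) / len(keep)
        displacement = {i: (b_new[i] - b_base[i]) - shift for i in keep}
        # sorted() keeps the choice deterministic if two items tie.
        worst = max(sorted(keep), key=lambda i: abs(displacement[i]))
        if abs(displacement[worst]) <= threshold:
            break
        keep.remove(worst)
    return sorted(keep)

# Hypothetical difficulties, not values from the study.
base = {"i1": -1.0, "i2": 0.0, "i3": 0.5, "i4": 1.2}
new = {"i1": -0.9, "i2": 0.1, "i3": 0.6, "i4": 2.2}
anchors = stable_anchor_items(base, new)
```

Removing one item at a time matters because a single outlying item pulls the mean shift toward itself and can make stable items look displaced.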

In addition, the assessment used in this study has unique and complex characteristics (e.g., test specifications, blueprint, and constraints) that may limit the generalizability of the results. For example, the forms assembled using ATA programs involved approximately 70 content and statistical constraints (i.e., domain representation, life stage of patient, clinical setting, mean item p-value, etc.), and maximized the TIF to create parallel forms. Although MTA and ATA forms were built according to the same domain representation, in MTA other variables were not as controlled as they were in ATA. Furthermore, ATA employs over 70 standardized constraints, whereas test developers loosened the constraints one by one, form by form, during MTA. It is expected that the similarity of constraints between ATA and MTA procedures influenced the findings of this study. The results of this study shed light on the enhancement in quality of ATA forms; however, this improvement calls for reconsidering the continued use of MTA forms as the base examination. Future research may address the similarities or differences in MTA and ATA procedures by simulating ATA conditions and assessing the outcomes from different equating methods. Such research would provide insight into how the similarity (or difference) between assembly procedures may influence outcomes from equating methodologies.

Prior researchers have developed a variety of models that can be used in order to implement optimal test design. A brief overview of the different models is discussed in van der Linden (1998). It is well-documented that forms developed via ATA produce more favorable psychometric properties than MTA due to the overall test design and the defining attributes of ATA (Luecht, 1998; van der Linden, 2005). This research study provides some evidence that ATA creates more parallel test forms, not only in terms of content and statistical specifications, but also with respect to test information, data-model fit, and decision consistency. Yet, very few studies have provided empirical evidence of the quality improvement of ATA over MTA. Due to the growing popularity of ATA, more research is warranted on the replicability of this study (e.g., simulation studies), on other psychometric advantages that result from implementing ATA, and on the application to assessment programs that have different purposes and test designs.


The widespread implementation of ATA procedures has alleviated the workload of test developers by allowing computer programs to create multiple parallel test forms with relative ease. ATA procedures provide an efficient and cost-effective alternative to assembling parallel test forms simultaneously. The integral psychometric goal of ATA is the minimization of the SEM and maximization of the test score reliability (van der Linden, 2005). However, ATA is not only a question of whether computer programs ease the workload, but also of whether they improve the psychometric quality of assembled test forms. The results presented in this research study provide empirical evidence of the improvement in psychometric qualities of ATA forms. Whenever testing organizations newly implement test development practices, it is important to evaluate the outcomes (AERA, APA, & NCME, 2014). Assessing the adequacy of score outcomes of various equating methods is one way to investigate the relationship between psychometric quality and new implementation of ATA programs. In this research study, I evaluated the adequacy of different equating methods by estimating the bias, error, and decision consistency associated with score outcomes of newly developed ATA forms.

The context of evaluating equating methodologies with respect to test assembly procedures is important in today’s operational psychometric work as many testing organizations move towards ATA. Although testing organizations may utilize item parameters estimated using the Rasch model or IRT models for ATA, no previous research has connected differences in test assembly procedures to outcomes of equating methods. The results of this study further support the importance of planning and aligning the psychometric procedures to the test development procedures. The findings of this study suggest that the error and consistency of scores were related to the similarity in statistical test specifications. ATA led to the development of parallel forms that had better psychometric properties and less variation in content and statistical specifications than test forms assembled manually.

The results indicated that despite the differences in statistical specifications, the mean/mean preequating method performed the most favorably. This finding may be explained by the alignment among the mean/mean preequating method, the psychometric framework of the Rasch model, and the fact that ATA utilizes the same item difficulties to build each form. Conceptually, the mean/mean preequating method is similar to the Rasch equating method, which anchors known item difficulties. Therefore, the mean/mean preequating method is aligned with the Rasch anchored equating method. Furthermore, because ATA utilizes the same known item difficulties to build forms that have similar TIFs that peak at the cut-score of the examination, all of these methods are complementary and work in tandem. Future research should expand on these findings by investigating the outcomes from similar equating methodologies when ATA forms are used as the base forms.
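The conceptual similarity between mean/mean linking and Rasch anchoring can be sketched with the standard mean/mean linking constants, following the general form given in Kolen and Brennan (2014). The function names and item parameters below are assumptions introduced for illustration, not values or code from this study.

```python
from statistics import fmean

def mean_mean_constants(a_new, b_new, a_base, b_base):
    """Mean/mean constants (A, B) that place new-form parameters on the base scale.

    a_* are common-item discriminations and b_* are common-item difficulties.
    """
    A = fmean(a_new) / fmean(a_base)
    B = fmean(b_base) - A * fmean(b_new)
    return A, B

def to_base_scale(b, A, B):
    """Transform a new-form difficulty to the base scale."""
    return A * b + B

# Hypothetical common-item parameters. Under the Rasch model every
# discrimination is fixed at 1, so the slope A is exactly 1.
a_new = [1.0, 1.0, 1.0]
a_base = [1.0, 1.0, 1.0]
b_new = [-0.8, 0.1, 0.9]
b_base = [-0.6, 0.3, 1.1]
A, B = mean_mean_constants(a_new, b_new, a_base, b_base)
linked = [to_base_scale(b, A, B) for b in b_new]
```

With A equal to 1, mean/mean linking reduces to adding a single constant to every difficulty, which is close in spirit to fixing known item difficulties on the base scale.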



Cited Literature

Ali, U. S., & van Rijn, P. W. (2016). An evaluation of different statistical targets for assembling parallel forms in item response theory. Applied Psychological Measurement, 40(3),

Antal, J., Proctor, T. P., & Melican, G. J. (2014). The effect of anchor test construction on scale drift. Applied Measurement in Education, 27(3), 159-172. doi:10.1080/08957347.2014.905785

American Educational Research Association, American Psychological Association, National Council on Measurement in Education, & Joint Committee on Standards for Educational and Psychological Testing (AERA, APA, & NCME). (2014). Standards for educational and psychological testing (pp. 75-94). Washington, DC: American Educational Research Association.

Armstrong, R. D., Jones, D. H., & Wang, Z. (1994). Automated parallel test construction using classical test theory. Journal of Educational Statistics, 19(1), 73-90. doi:10.2307/1165178

Andrich, D. (2004). Controversy and the Rasch model: A characteristic of incompatible paradigms? In Smith, E. V., & Smith, R. M. (Eds.), Introduction to Rasch Measurement: Theory, models and applications. Maple Grove, MN: JAM Press.

Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (pp. 508–600). Washington, DC: American Council on Education.

Babcock, B., & Albano, A. D. (2012). Rasch scale stability in the presence of item parameter and trait drift. Applied Psychological Measurement, 36(7), 565-580. doi:10.1177/0146621612455090

Bock, D. R. (1997). A brief history of item response theory. Educational Measurement: Issues and Practice, 16(4), 21-33.


Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart, and Winston.

Debeer, D., Ali, U. S., & van Rijn, P. W. (2017). Evaluating the statistical targets for assembling parallel mixed-format test forms. Journal of Educational Measurement, 54(2), 218-242.

Dorans, N., Moses, T., & Eignor, D. (2010). Principles and Practices of Test Score Equating. ETS RR-10-29. ETS Research Report Series.

Eignor, D. R., & Stocking, M. L. (1986). An investigation of possible causes for the inadequacy of IRT preequating (Report No 86-14). Princeton, NJ: ETS Research Report Series. Retrieved from http://onlinelibrary.wiley.com/doi/10.1002/j.2330-8516.1986.tb00169.x/epdf

Embretson, S. E., & Reise, S. P. (2009). Item response theory for psychologists. New York, NY: Psychology Press.

Gao, R., He, W., & Ruan, C. (2012). Does preequating work? An investigation into a preequated testlet-based college placement examination using post administration data (Report No 12-12). Princeton, NJ: ETS Research Report Series.

Hambleton, R., Swaminathan, H., & Rogers, H. (1991). Fundamentals of Item Response Theory. Newbury Park, CA: Sage.

Hambleton, R., & Slater, S. (1997). Item response theory models and testing practices: Current international status and future directions. European Journal of Psychological Assessment, 13(1), 21-28. doi:10.1027/1015-5759.13.1.21

Hanson, B. A., Zheng, L., & Cui, Z. (2004a). PIE: A computer program for IRT equating [computer program]. Iowa City, IA: education.uiowa.edu/centers/casma

Hanson, B. A., Zheng, L., & Cui, Z. (2004b). ST: A computer program for IRT scale linking [computer program]. Iowa City, IA: education.uiowa.edu/centers/casma

Han, T., Kolen, M., & Pohlmann, J. (1997). A comparison among IRT true- and observed-score equatings and traditional equipercentile equating. Applied Measurement in Education, 10(2), 105-121. doi:10.1207/s15324818ame1002_1


Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9(2), 139-164.

Holland, P. W., & Dorans, N. J. (2006). Linking and equating. In Brennan, R. L. (Ed.), Educational measurement (pp. 187-220). Westport, CT: Praeger.

Huynh, H. (1990). Computation and statistical inference for decision consistency indexes based on the Rasch Model. Journal of Educational Statistics, 15(4), 353-368. doi:10.2307/1165093

Huynh, H., & Rawls, A. (2011). A comparison between robust z and 0.3-logit difference procedures in assessing stability of linking items for the Rasch model. Journal of Applied Measurement, 12(2), 96.

Karabatsos, G. (2017). Elements of psychometric theory lecture notes. Personal Collection of G. Karabatsos, University of Illinois at Chicago, Chicago, Illinois.

Kolen, M. J. (1981). Comparison of traditional and item response theory methods for equating tests. Journal of Educational Measurement, 18(1), 1-11.

Kolen, M. J., & Brennan, R. L. (2014). Test equating, scaling, and linking (3rd ed.). New York, NY: Springer.

Kolen, M. J., & Harris, D. J. (1990). Comparison of item preequating and random groups equating using IRT and equipercentile methods. Journal of Educational Measurement, 27(1), 27-30.

Lee, W. (2010). Classification consistency and accuracy for complex assessments using item response theory. Journal of Educational Measurement, 47(1), 1-17.

Li, D., Jiang, Y., & von Davier, A. A. (2012). The accuracy and consistency of a series of IRT true score equatings. Journal of Educational Measurement, 49(2), 167-189. doi:10.1111/j.1745-3984.2012.00167.x

Lin, C.-J. (2008). Comparisons between Classical Test Theory and Item Response Theory in Automated Assembly of Parallel Test Forms. Journal of Technology, Learning, and Assessment, 6(8).


Linacre, J. M. (2017). Winsteps® Rasch measurement [computer program]. Beaverton, OR:

Linacre, J. M. (2017). Fit diagnosis: infit outfit mean-square standardized. Retrieved from

Livingston, S. L. (2004). Equating test scores (without IRT). Princeton, NJ: ETS. Retrieved from: https://www.ets.org/Media/Research/pdf/LIVINGSTON.pdf

Lord, F. M. (1977). Practical applications of item characteristic curve theory. Journal of Educational Measurement, 14(2), 117-138. doi:10.1111/j.1745-3984.1977.tb00032.x

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley Publishing Company.

Lord, F. M., & Wingersky, M. S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings.” Applied Psychological Measurement, 8(4), 453-461. doi:10.1177/014662168400800409

Luecht, R. M. (1998). Computer-assisted test assembly using optimization heuristics. Applied Psychological Measurement, 22(3), 224-236. doi:10.1177/01466216980223003

Mead, R. (2008). A Rasch primer: The measurement theory of Georg Rasch. Psychometrics services research memorandum 2008–001. Maple Grove, MN: Data Recognition Corporation.

Penfield, R. D. (2005). Unique properties of Rasch model item information functions. Journal of Applied Measurement, 6(4), 355-365.

O’Neill, T., Peabody, M., Tan, R. J. B., & Du, Y. (2013). How much item drift is too much? Rasch Measurement Transactions, 27(3), 1423-1424. Retrieved from: https://www.rasch.org/rmt/rmt273a.htm

RStudio Team (2016). RStudio: Integrated Development for R. RStudio, Inc. [computer program]. Boston, MA. Retrieved from http://www.rstudio.com/

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Danish Institute for Educational Research. Chicago: The University of Chicago Press.

Smith, E. V., Stearus, M., Sorenson, B., Huynh, H., & McNaughton, T. (2018). Rasch DC [computer program]. Unpublished, Department of Educational Psychology, University of Illinois at Chicago, Chicago, IL.


Smith, E. V. (2004). Evidence of the reliability of measures and validity of measure interpretation: A Rasch measurement perspective. In Smith, E. V., & Smith, R. M. (Eds.), Introduction to Rasch Measurement: Theory, models and applications. Maple Grove, MN: JAM Press.

Smith, E. V. (2005). Effect of item redundancy on Rasch item and person estimates. Journal of Applied Measurement, 6(2), 147-163.

Smith, R. M. (2003). Rasch measurement models: Interpreting Winsteps and FACETS output. Maple Grove, MN: JAM Press.

Smith, R. M. (1991). The distributional properties of Rasch item fit statistics. Educational and Psychological Measurement, 51, 541-565.

Smith, R. M., & Kramer, G. (1992). A comparison of two methods of test equating in the Rasch model. Educational and Psychological Measurement, 52, 835-846.

Stephens, M. A. (1974). EDF Statistics for Goodness of Fit and Some Comparisons. Journal of the American Statistical Association, 69, 730-737.

Subkoviak, M. J. (1985). Tables of reliability coefficients for mastery tests. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.

Stocking, M. (1991, April). An experiment in the application of an automated item selection method to real data. Paper presented at the Annual Meeting of the National Council on Measurement in Education, Boston, MA.

Taylor, C. S., & Lee, Y. (2010). Stability of Rasch scales over time. Applied Measurement in Education, 23(1), 87-113. doi:10.1080/08957340903423701

Tong, Y., & Kolen, M. J. (2005). Assessing equating results on different equating criteria. Applied Psychological Measurement, 29(6), 418-432. doi:10.1177/0146621606280071

Tong, Y., Wu, S.-S., & Xu, M. (2008, March). A comparison of preequating and post-equating using large-scale assessment data. Paper presented at the Annual Conference of the American Educational Research Association, New York City, NY.


van der Linden, W. J., & Hambleton, R. K. (1997). Handbook of modern item response theory. New York, NY: Springer.

van der Linden, W. J. (2005). Linear models for optimal test design. New York, NY: Springer.

van der Linden, W. J. (1998). Optimal assembly of psychological and educational tests. Applied Psychological Measurement, 22(3), 195-211. doi:10.1177/01466216980223001

von Davier, A. A. (2010). Equating and Scaling. In Peterson, P., Baker, E., & McGaw, B. (Eds.), International Encyclopedia of Education (pp. 50-55). Amsterdam: Academic Press.

von Davier, A. A., & Wilson, C. (2007). IRT true score test equating: A guide through assumptions and applications. Journal of Educational and Psychological Measurement, 67(6), 940-957. doi:10.1177/0013164407301543

Wightman, L. F. (1998). Practical issues in computerized test assembly. Applied Psychological Measurement, 22(3), 292-302. doi:10.1177/01466216980223009

Wilson, M. (1988). Detecting and interpreting local item dependence using a family of Rasch models. Applied Psychological Measurement, 12, 353-364.

Wright, B. D. (1977). Solving measurement problems with the Rasch Model. Journal of Educational Measurement, 14(2), 97-115.

Wright, B. D., & Mok, M. C. (2004). An overview of the family of Rasch Measurement Models. In Smith, E. V., & Smith, R. M. (Eds.), Introduction to Rasch Measurement: Theory, models and applications. Maple Grove, MN: JAM Press.

Wright, B. D., & Linacre, M. (1994). Reasonable Mean-Square Fit Statistics. Rasch Measurement Transactions, 8(3), 370. Retrieved from: https://www.rasch.org/rmt/rmt83b.htm

Yen, W. M., & Fitzpatrick, A. R. (2006). Item Response Theory. In Brennan, R. L. (Ed.), Educational measurement, fourth edition. Portsmouth, NH: Praeger.

Yi, H. S., Kim, S., & Brennan, R. L. (2007). A method for estimating classification consistency indices for two equated forms. Applied Psychological Measurement, 32(4), 275-291.







Form  Equating  Scale Linking  Raw agreement  Error  Kappa  Error
1     Baseline Comparison      0.949          0.006  0.682  0.033
1     OS        MM             0.958          0.007  0.764  0.035
1     OS        MS             0.955          0.007  0.736  0.035
1     OS        SL             0.948          0.008  0.743  0.035
1     TS        MM             0.958          0.007  0.763  0.035
1     TS        MS             0.955          0.007  0.734  0.035
1     TS        SL             0.948          0.008  0.745  0.035
1     PE        MM             0.971          0.007  0.633  0.063
1     PE        MS             0.971          0.007  0.641  0.061
1     PE        SL             0.972          0.007  0.646  0.061
2     Baseline Comparison      0.958          0.006  0.657  0.040
2     OS        MM             0.958          0.007  0.649  0.049
2     OS        MS             0.967          0.006  0.714  0.048
2     OS        SL             0.963          0.007  0.651  0.052
2     TS        MM             0.958          0.007  0.652  0.049
2     TS        MS             0.966          0.006  0.712  0.047
2     TS        SL             0.963          0.007  0.651  0.052
2     PE        MM             0.971          0.006  0.697  0.053
2     PE        MS             0.969          0.006  0.697  0.052
2     PE        SL             0.970          0.006  0.699  0.052
3     Baseline Comparison      0.947          0.006  0.647  0.034
3     OS        MM             0.960          0.006  0.617  0.044
3     OS        MS             0.954          0.007  0.631  0.040
3     OS        SL             0.965          0.006  0.587  0.050
3     TS        MM             0.960          0.006  0.612  0.044
3     TS        MS             0.954          0.007  0.623  0.040
3     TS        SL             0.965          0.006  0.580  0.051
3     PE        MM             0.959          0.006  0.699  0.037
3     PE        MS             0.958          0.006  0.709  0.036
3     PE        SL             0.958          0.006  0.701  0.037
4     Baseline Comparison      0.935          0.009  0.719  0.034
4     OS        MM             0.955          0.007  0.791  0.029
4     OS        MS             0.955          0.007  0.793  0.029
4     OS        SL             0.950          0.007  0.802  0.026
4     TS        MM             0.956          0.007  0.803  0.028
4     TS        MS             0.958          0.007  0.814  0.027
4     TS        SL             0.952          0.007  0.818  0.025
4     PE        MM             0.963          0.007  0.678  0.044
4     PE        MS             0.963          0.007  0.697  0.042
4     PE        SL             0.963          0.007  0.686  0.044

Note. Boldface signifies maximum value. MM represents mean/mean scale linking, MS represents the mean/sigma scale linking, and SL represents the Stocking and Lord TCC scale linking procedure. OS represents IRT observed score equating, PE represents IRT preequating, and TS represents IRT true score equating.

Association of Standardized Patient Annual Conference, June 9-11, 2019

Standardizing Judgment: A Qualitative Study of How SPs Co-Construct Meaning

This presentation reported on the results of a discourse analysis of 22 Standardized Patient (SP) interviews. The research received IRB approval through the University of California, San Diego. The research questions were: 1) How do SPs maintain “standardization” in role performance and assessment, and 2) to what degree do SPs adhere to standardization? The results indicated that 1) the term “standardization” is co-constructed by test developers, psychometricians, SP trainers, and SPs, 2) SP trainers employ non-standardized approaches in their training, and 3) SPs are highly invested in maintaining a standard of role portrayal and assessment, but the personal resources they bring to it are highly subjective.

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides testing for osteopathic medical licensure and related healthcare professions, is excited to mark 2019 as the 85th anniversary of its founding.

Since its founding in 1934, and particularly in recent years, the NBOME has experienced substantial growth and opportunities given the many changes in healthcare, medical education and medical regulation. The NBOME’s assessment portfolio has grown to encompass the continuum of education, training, and continuous professional development for osteopathic medicine and a number of other healthcare professions, both in the United States and in other countries.

“We continue to be steadfast in our commitment to the NBOME’s mission and producing valid, reliable, and defensible national standardized assessment tools, and acknowledging the critical role that osteopathically distinctive assessment plays in protecting the public,” said John R. Gimpel, DO, MEd, NBOME President and CEO. “We are thankful for the strategic direction and leadership our Board of Directors has provided throughout our history, and the unparalleled support NBOME receives from the osteopathic medical community and our numerous other partners in health care.”

Coinciding with the 85-year celebration is the launch of NBOME’s enhanced blueprint for the COMLEX-USA examination series. This multi-stage update to the organization’s flagship exam series is the culmination of nearly 10 years of work in evidence-based design led by NBOME’s National Faculty, including subject matter experts from the education, licensure and clinical practice communities. In addition, NBOME has expanded the COMAT product portfolio with the Foundational Biomedical Sciences exam series, and has also introduced the CATALYST longitudinal assessment platform for formative testing and continuous professional development.

A provisional logo and accompanying information resources focused on NBOME’s 85-year legacy, including a dedicated webpage and social media campaign, are planned for the remainder of the year.

About the NBOME 

NBOME is an independent, non-governmental, non-profit assessment organization committed to protecting the public by providing the means to assess competencies for osteopathic medicine and related health care professions. NBOME’s COMLEX-USA examination series is a requirement for graduation from colleges of osteopathic medicine and provides the pathway to licensure for osteopathic physicians in the United States and numerous international jurisdictions.

The NBOME congratulates former board member Ronald Burns, DO, on his new appointment as president of the American Osteopathic Association on Saturday, July 27, 2019.

Dr. Burns assumed the presidency before an estimated 500 osteopathic physicians (DOs) at the American Osteopathic Association’s annual business meeting in Chicago. The organization represents the professional interests of the nation’s more than 145,000 DOs and osteopathic medical students.

From 2009 to 2018, Dr. Burns served on the NBOME Board, participated in the COMLEX-USA Level 3 Advisory Committee and the Executive Committee, and chaired the Nominating Committee. He has also previously served on the Cognitive Testing Advisory Committee.

Dr. Burns is a board-certified family medicine physician, with a private practice in Orlando, Florida. He has represented the state of Florida in the AOA House of Delegates for the past 19 years. He has also been a member of the Florida Osteopathic Medical Association for over 29 years and served as its president from 2004–2005.

“My goal is to ensure the AOA is ready to meet the current and future needs of its ever-growing body of constituents,” said Dr. Burns. “By necessity, that means having strong collaborative relationships with AOA’s affiliate organizations.”

We wish Dr. Burns the best of luck, and look forward to working with our longtime colleague in his new position.

Read more about Dr. Burns’ role as AOA President.



Mathematical programming has been widely used by professionals in testing agencies as a tool to automatically construct equivalent test forms. This study introduces the linear programming capabilities (modeling language plus solvers) of SAS Operations Research as a platform to rigorously engineer tests on specifications in an automated manner. To that end, real items from a medical licensing test are used to demonstrate the simultaneous assembly of multiple parallel test forms under two separate linear programming scenarios: (a) constraint satisfaction (one problem) and (b) combinatorial optimization (three problems). In the four problems from the two scenarios, the forms are assembled subject to various content and psychometric constraints. Assembled forms are next assessed using psychometric methods to ensure equivalence with respect to all test specifications. Results from this study support SAS as a reliable and easy-to-implement platform for form assembly. Annotated code is provided to promote further research and operational work in this area.
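As a rough illustration of the combinatorial optimization scenario described above, the sketch below assembles two non-overlapping forms from a tiny item pool. It is not the study's SAS code and does not use linear programming: it swaps in plain exhaustive search in Python, which is only practical for toy pools. The item IDs, content areas, difficulty values, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical item pool: (item_id, content_area, difficulty).
POOL = [
    ("i1", "cardio", 0.40), ("i2", "cardio", 0.60),
    ("i3", "cardio", 0.50), ("i4", "cardio", 0.50),
    ("i5", "neuro", 0.30), ("i6", "neuro", 0.70),
    ("i7", "neuro", 0.45), ("i8", "neuro", 0.55),
]

# Hypothetical content specification: items required per content area.
CONTENT_SPEC = {"cardio": 2, "neuro": 2}


def meets_content(form):
    """Return True if the form has exactly the required items per area."""
    counts = {}
    for _, area, _ in form:
        counts[area] = counts.get(area, 0) + 1
    return counts == CONTENT_SPEC


def mean_difficulty(form):
    """Average difficulty of the items on a form."""
    return sum(d for _, _, d in form) / len(form)


def assemble_two_forms(pool, form_len):
    """Find two disjoint forms that satisfy the content specification
    and have the smallest gap in mean difficulty (exhaustive search)."""
    feasible = [f for f in combinations(pool, form_len) if meets_content(f)]
    best = None
    for a, b in combinations(feasible, 2):
        if set(a) & set(b):
            continue  # forms may not share items
        gap = abs(mean_difficulty(a) - mean_difficulty(b))
        if best is None or gap < best[0]:
            best = (gap, a, b)
    return best


if __name__ == "__main__":
    gap, form_a, form_b = assemble_two_forms(POOL, 4)
    print("Form A:", [item_id for item_id, _, _ in form_a])
    print("Form B:", [item_id for item_id, _, _ in form_b])
    print("Difficulty gap:", round(gap, 6))
```

The content check plays the role of the constraints, and the difficulty gap plays the role of the objective. For an operational pool, a mathematical programming solver would replace the nested search.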


To read this article as it was initially published in Applied Psychological Measurement click here.


Contributed by:

Can Shao | Research Scientist | Curriculum Associates

Hongwei “Patrick” Yang | Assistant Professor | The University of West Florida

Silu Liu | Epidemiological Research Manager | DCHealth Technology

Tsung-Hsun “Edward” Tsai, PhD | Associate Vice President of Assessment Services and Research | NBOME

In addition to sitting on our Board of Directors at the NBOME, and frequently contributing test items for COMLEX-USA Level 3 and the EM COMAT, Elizabeth McMurtry, DO, FACEP, serves as Assistant Dean for Clinical Education and Faculty Development for Pacific Northwest University of Health Sciences College of Osteopathic Medicine (PNWU-COM) in Yakima, WA. She also practices emergency medicine across Kennewick, Pasco, and Richland, otherwise known as Washington’s Tri-Cities area.



The NBOME caught up with Dr. McMurtry in her Airstream Interstate, somewhere in the vicinity of Walla Walla, Washington, where we had a chat to learn a little more about her career and community — and her Airstream.

NBOME: What drew you to the Pacific Northwest?

EM: If you’re a fan of the outdoors like me, there’s a lot of it here. I love camping, hiking, whitewater rafting, etc. Walla Walla’s climate is suited to outdoor adventures, and it’s a wine town; plus, it’s a lot of fun to do some of the cultural things that revolve around a tourist town. By the way, the Northwest’s wine country happens to rival any other wine area in the world.


NBOME: Good to know! So what is the osteopathic medical community like in your area?

EM: Osteopathic medicine in Walla Walla is on the rise overall, thankfully. Unfortunately there isn’t much in the way of osteopathic manipulative treatment (OMT) available to patients here. We have plenty of DOs, but there aren’t many focusing on and practicing OMT. So if you’re a DO who wants to come to a more rural town that’s got a whole lot going on and you want to practice OMT, Walla Walla is wide open.


NBOME: What drew you to emergency medicine specifically?

EM: In emergency medicine you have to approach patients very holistically, whatever their critical issue is. When they come into the emergency department we have to consider their cultural needs, their home environment, their ability to pay for medications or therapies, whether they need to be hospitalized, etc. There’s a lot that goes into it from a whole-person treatment standpoint. That was always very appealing to me as an osteopathic physician.


NBOME: There’s been some buzz around the office about your Airstream motorhome. Tell us a little more.

EM: I practice at much larger facilities about an hour and a half away in the Tri-Cities. Between working my clinical shifts, traveling for my full-time job with PNWU-COM’s clinical education team, and my NBOME Board obligations, I’m away from home about half of the month every month — if not more. For years I’ve had an RV of some sort or another, but I bought this one a couple years ago to be a little more comfortable while traveling and have more flexible working shifts.


NBOME: Aside from the convenience, do you think the Airstream has any benefits to your practice?

EM: Absolutely. It allows me to stay in the area, rather than travel back and forth. That allows me to stay more involved in my facility and be more supportive of my colleagues. In my free time, it also lets me get the lay of the land in the communities I’m working in. Sure, it’s fun to go play after work, but when we’re talking about holistic, whole-person care, it’s so important to understand the environment your patients are coming from.


NBOME: Anything else you want to share?

EM: 1. Walla Walla is ready for OMT. 2. It has some of the best wine in the world. 3. Van life rocks!


Elizabeth A. McMurtry, DO, of Washington State, is an educator and clinical leader in emergency medicine. She serves as assistant dean for clinical education and faculty development for Pacific Northwest University of Health Sciences, where she is devoted to improving community-based education and enhancing the clinical learning experience. Dr. McMurtry has extensive experience precepting medical students and residents in community emergency departments, and she enjoys sharing successful models of education with community clinicians.

2020-2021 COMLEX-USA Bulletin of Information



In 1989, the NBOME embarked on a journey to convert the NBOME Part I, Part II, and Part III Examinations to the new Comprehensive Osteopathic Medical Licensing Examination for the United States, what we all know today as COMLEX-USA. The journey continues.

The Catalyst.  Our intent was to move away from conventional content groupings and “basic sciences” that were common with national physician licensing exams of the time.  These exams typically organized test content around disciplines, but evidence from the learning sciences had emerged that regrouping content around patient presentations would likely be more relevant to clinical practice, and ultimately produce a more valid test for licensure.

As part of the development process, members of the NBOME Board studied osteopathic physician practice patterns with an eye to aligning what was tested with what occurred in actual practice. Research demonstrated differences in patient presentations to DOs as compared to those to MDs, further underscoring the need for a separate and valid licensure assessment. This new, unique testing paradigm looked at both the manner in which patients presented to DOs and the physician tasks employed to care for them, with enhanced focus on high-frequency, high-impact clinical presentations important for patient safety and quality osteopathic medical practice.

This novel approach to examination design remains as valid today as it was when it was first implemented in 1995.

The Hallmarks.  While Level 1 focused on basic science concepts and osteopathic principles related to patient presentations and physician tasks, you’ll notice these foundational biomedical science and osteopathic concepts – the “basic sciences” – thread through all three levels to varying degrees.

COMLEX-USA was designed to directly align to the educational program leading to the DO degree, with developmentally appropriate content and physician tasks assessed at each Level.  It also adopted a stage-gate, developmental approach, requiring Level 1 be passed before eligibility was granted for Level 2, and Level 2 passed before Level 3 was attempted.

This garnered considerable attention in osteopathic medical education and accreditation circles, such that the AOA’s Commission on Osteopathic College Accreditation expanded its graduation requirements for the DO degree to mandate passing COMLEX-USA Levels 1 and 2.

The Milestones.  COMLEX-USA Level 3, the first of the new series, launched in 1995 and was followed closely by the debut of Level 2 in 1997, and Level 1 in 1998.

In the decade that followed, a national, standardized patient-based clinical skills examination, COMLEX-USA Level 2-PE, was added in 2004, and the transition to computer-based testing and one-day administration at professional test centers began in 2005.

Computer-based testing enabled us to expand from two administrations per year per exam level to, for example, more than 60 administrations per year now for Level 1. It also enabled enhancements to score reporting, test security and integrity, as well as novel test items featuring multimedia to enhance authenticity, which were introduced in 2006.

While test blueprints are modified regularly to reflect the evolving practice of osteopathic medicine, the most significant enhancement of the COMLEX-USA Master Blueprint came in 2018-2019, spearheaded by the Blue Ribbon Panel on Enhancing COMLEX-USA (2010-2016). This expanded Level 3 to two days, with an increased number of clinical decision-making, key-features scenarios that focus on patient safety. Attestation by residency program directors that Level 3 candidates were in good academic and professional standing was added in 2018. The enhancement also introduced the new competency-based blueprint, converting physician tasks to competency domains with required elements and measured outcomes, while expanding the detail provided about the clinical presentations likely to be assessed.

The COMLEX-USA examination program continues to evolve to align with the practice of osteopathic medicine as we work to lead the way in valid, reliable, defensible and fair assessment for osteopathic physician licensure, helping to keep our patients safe.  The COMLEX-USA journey continues.


Contributed by John R. Gimpel, DO, MEd | President & CEO  |  NBOME

PHILADELPHIA, PA. The National Board of Osteopathic Medical Examiners (NBOME), an independent, not-for-profit organization that provides testing for osteopathic medical licensure, today announced Thomas A. Cavalieri, DO, as the recipient of its 2019 Santucci Award. The award recognizes sustained outstanding contributions to the NBOME’s mission of protecting the public by providing the means to assess competencies for osteopathic medicine and related health care professions.

At its midyear Board of Directors meeting, which celebrated its 85th anniversary as an organization, the NBOME honored its past-chair and longtime National Faculty leader, Dr. Thomas A. Cavalieri.

“It is an honor and a privilege to celebrate the life and work of a man who has had such an impact on our mission and our profession,” said current NBOME Board Chair, Dana Shaffer, DO. “His work and dedication have gone a long way toward producing superior doctors and outcomes for patients across the US.”

Dr. Cavalieri was first recruited in the late 1980s as an exam item writer by fellow NBOME past-chairs, Frederick G. Meoli, DO and Thomas Santucci, DO. He became Board President in 2001, and Board Chair in 2002 when the NBOME first hired a full-time physician President and CEO. During this time, he oversaw the launch of the COMLEX-USA Level 2-Performance Evaluation.

Dr. Cavalieri has led various NBOME committees, chairing the COMLEX-USA Level 3 examination committee as well as the Standards and Assurances committee. Additionally, he was a principal author of the published manuscript, “The Predictive Validity of Osteopathic Medical Licensing Examinations for Osteopathic Medical Knowledge Measured by Graduate Written Examinations.”

“Dr. Cavalieri led the NBOME at a pivotal time in our history,” said NBOME President & CEO, John R. Gimpel, DO, MEd. “His leadership has set an example of steadfast commitment to excellence and osteopathic distinctiveness that he has continued long past his tenure at our organization.”




In honor of Father’s Day, we’d like to recognize those among our Board, National Faculty, and Staff who generously share their time between the NBOME family and families of their own. Balancing work and family is no small feat, and we truly appreciate the commitment and dedication this requires.

Let this be a weekend of reflection and gratitude to those who have been fathers, who are fathers, or who have served as a father figure for someone who needed one. We wish you a weekend of hard-earned relaxation and appreciation for all that you DO. Happy Father’s Day!


John R. Gimpel, DO, MEd | President and CEO | NBOME

When assessing medical schools, aspiring physicians should investigate how students at their target schools perform on American national medical licensing exams, experts say.

Licensing exam scores determine whether someone is legally permitted to become a U.S. doctor, experts say. Dr. John R. Gimpel, the president and CEO of the National Board of Osteopathic Medical Examiners, says the purpose behind medical licensing exams is to ensure that everyone who practices medicine is competent to do so. “They’re all about protecting patients and protecting the public,” Gimpel says.

Gimpel emphasizes that med school hopefuls shouldn’t use licensing exam scores as the be-all and end-all deciding factor when choosing a med school, since the exams aren’t built to assess the quality of a school. He adds that there are many other considerations beyond licensing exam scores that prospective med students should weigh carefully, such as whether the mission of a school aligns with their career goals and whether the school has a top-notch teaching faculty.

However, when a med school consistently posts licensing exam scores above the national average, that is a reassuring sign, Gimpel says. “That helps to show that their graduates are at least succeeding well in passing their licensing exams and getting high scores, and that’s usually going to mean that those doctors are going to be successful in getting into the residency programs that they’d like to and into the career paths that they like,” he says.

The most crucial licensing exam statistics for a prospective medical student to know about each of the med schools he or she is considering are the pass rates among first-time test takers for the first two parts of either the allopathic or osteopathic medical licensing exams, experts say. When licensing exam pass rates dip below 90%, that is cause for concern, says Dr. G. Richard Olds, president of St. George’s University, an international academic institution that includes a medical school and has a main campus located in the West Indies. Pass rates hovering around or approaching 70% are particularly troubling because med schools with exceptionally low licensing exam pass rates run the risk of losing their eligibility to receive federal student aid funding, he adds.

Why Passing Licensing Exams Is Critical

Students and recent graduates of M.D. programs at allopathic medical schools must pass the United States Medical Licensing Examination, which is commonly known as the USMLE, in order to become licensed doctors who can work independently in the U.S. Similarly, anyone attending a D.O. program at an osteopathic medical school will need to pass the Comprehensive Osteopathic Medical Licensing Exam of the United States, which is commonly known as COMLEX-USA, in order to practice medicine without supervision in America.

Both the USMLE and the COMLEX-USA are three-part exams, and each part is typically completed at a specific moment within a medical trainee’s yearslong education in medicine. The first part is usually taken after the first two years of medical school, while the second portion – which includes both a written knowledge test and a hands-on clinical skills test – is ordinarily completed during the latter half of medical school, with the third and final part reserved for the first or second year of a medical residency.

The licensing test results that are most relevant for vetting medical schools are the scores and pass rates for the first two portions of the USMLE and COMLEX-USA exams, according to experts. Those are the two components of the licensing exams that are designed to judge whether medical students have gained sufficient knowledge and skills through their medical school curriculum, experts explain. Meanwhile, the last part of these exams assesses the competencies of medical residents.

Among the 110 ranked medical schools in the U.S. News Best Medical Schools rankings that reported their USMLE pass rates for first-time test takers during the 2016-17 school year, the average pass rate for Step 1 of the exam was 96.3%, whereas the average pass rate for the Step 2 clinical knowledge test was 96.6%. Meanwhile, according to the National Board of Osteopathic Medical Examiners, which administers the COMLEX-USA exam, the average pass rate among first-time test takers during the 2016-17 academic year was 92.7% for Level 1 of the exam and 93.2% for the Level 2 Cognitive Evaluation portion of the exam.

How Licensing Exams Factor Into the Residency Match Process

Licensing exam scores influence whether someone is a competitive candidate for a desirable residency in his or her dream medical specialty, physicians say. Exam scores are one factor among many that U.S. residency directors consider during the selection process for their residency programs, according to doctors.

Students with high licensing exam scores have a greater chance of obtaining highly coveted residency placements. Meanwhile, students with low scores may struggle to find a residency match, experts explain.

Some osteopathic medical students opt to take both the USMLE and the COMLEX-USA so they can maximize their competitiveness for selective residency programs, but doing so is not mandatory, experts say. Late last year, the American Medical Association’s House of Delegates unanimously approved a resolution which states that the USMLE and the COMLEX-USA should be regarded as equivalent during the residency match process, and neither exam should be favored over the other.

Why You Should Put Licensing Exam Scores in Context

Olds says medical school hopefuls should not automatically assume that schools with slightly higher scores and pass rates on these exams are superior to institutions with marginally lower numbers. There are certain factors that can heavily influence those statistics that have nothing to do with the quality of a school’s curriculum, he says.

“The school shouldn’t be taking students in that they’re not prepared to help be successful, especially because most kids who make it into med schools are highly qualified,” he says.

Another important consideration to keep in mind when examining licensing exam results is that these results are at least partially influenced by the type of student a school usually admits, Olds says. If a med school strongly considers MCAT scores and undergraduate GPAs when making admissions decisions and the vast majority of its students have stellar scores and grades, its typical admitted student is predisposed to perform well on the medical licensing exam without requiring much academic assistance. In contrast, he says, a school that enrolls a significant number of students with mediocre or low MCAT scores and average or poor GPAs will likely need to provide significant academic help to its students to ensure that they pass licensing exams.

Olds says prospective med students should see how a school’s licensing exam scores compare with the MCAT scores of its incoming students. If a school’s average MCAT score is significantly more impressive than its average licensing exam scores, that may indicate that a school’s med students aren’t reaching their full potential, he says. On the other hand, if a school’s average licensing exam score is as good as or superior to its average MCAT score, that is a positive indication that a school is helping its students grow their knowledge and skill sets. “You want them to help students outperform what would be predicted,” he says.

Gina Moses, the director of admissions recruitment at the New York Institute of Technology College of Osteopathic Medicine, says prospective medical students can often find information about how a medical school’s students perform on licensing exams in the outcomes section of school websites. Another place to look is in school catalogs, she says.

Moses says med school hopefuls should look up both the average licensing exam scores and the pass rates at the med schools on their short list in order to decide where to apply. Still, she cautions that the more important figures are the pass rates, which are the “better indicator of student success,” she wrote in an email. It’s important for prospective students to consider these figures alongside other important criteria for picking a med school, such as student support services, cost and accreditation, she says.

“The numbers must be viewed as part of the whole description of the medical school,” she says. “The ideal environment will be different for each applicant so an applicant will need to investigate all information about the school so as to best evaluate his or her options and narrow his or her application choices.”


To read the article as it originally appeared in US News & World Report, please click here.

About the author.  Ilana Kowarski is a reporter for U.S. News, where she covers graduate school admissions. She produces advice content for applicants to MBA programs, law schools and medical schools.

This year, the NBOME celebrates the 85th anniversary of its founding. Not surprisingly, this will be a recurring topic in our various published articles and communications as we reflect on key elements in our rich history and commemorate this significant anniversary milestone.

But let’s start at the beginning, way back in 1934, when three forward-thinking American Osteopathic Association (AOA) presidents came together with a keen understanding of the need for a profession, such as osteopathic medicine, to self-regulate and set its own standards.

The collaboration of Arthur G. Hildreth, DO, Charles Hazzard, DO, and Asa Willard, DO, led to the creation of the National Board of Examiners for Osteopathic Physicians and Surgeons (the original name of the NBOME). The organization’s first president, Dr. Hazzard, was a graduate of Northwestern University and one of the first to graduate from the American School of Osteopathy (ASO), now A.T. Still University – Kirksville College of Osteopathic Medicine. He was also a member of the New York State Board of Medical Examiners and served as AOA President in 1903.

But first, let’s set the stage with some fun facts about the early 1930s, though it’s not a decade often connected with the word “fun.” 1933 marked the end of Prohibition, the Masters golf tournament first teed off in 1934, and the Chicago Blackhawks won the Stanley Cup, also in 1934.

Sadly, this was also the era of the Great Depression, marked by a downward economic spiral, stock market devaluation, bank failure, and extraordinary unemployment. From this era of depression, heightened regulation emerged, with President Franklin D. Roosevelt’s signing of the Securities Exchange Act, which created the U.S. Securities and Exchange Commission, in 1934 and the Social Security Act in 1935. This New Deal legislation represented a new role for government in American life, offering new protections and a safety net to millions of suffering people.

Medical education and regulation were in their early stages in the 1930s, when the Carnegie Foundation’s infamous Flexner Report of 1910 was having a major impact on the nation’s medical schools. The report called for enhanced standardization of basic science content instruction as well as more rigorous pre-matriculation requirements for those entering U.S. medical schools.

In the 1930s, each state granted physicians the license to practice using its own separate licensure exams, with many adopting laws requiring basic science examinations. And while osteopathic medical colleges expanded their curricula to meet these basic science requirements, AOA pioneers began to recognize the importance and also the responsibility of the profession to self-regulate and set its own standards. One of the critical elements, in their view, was to establish and maintain regulation and a strong professional identity, both for the good of the profession and the patients they serve.

As Dr. Hazzard explained, “The difficulties, inconvenience and expense” of having to take separate state board examinations with different requirements could be overcome by having one standardized osteopathic medical licensing exam designed with the highest standards possible.

By this time, DOs were still criticized by some traditional physicians not only for their use of hands-on diagnosis and osteopathic manipulative treatment (OMT) but also for their “body-mind-spirit” approach to caring for patients, and leaders of the profession recognized the role that distinctive assessment played in shaping the kind of physician envisioned by the profession’s founders.

“With independent regulation,” wrote Willard in 1932, “we will ultimately get every right for ourselves and those to come.” This indeed came to pass, though not quickly: in 1937, only 26 states agreed to extend practice privileges to DOs commensurate to those granted to MDs, and none recognized NBOME’s exams!

Fast forward 68 years to 2005, when Louisiana finally granted full and unrestricted practice rights to DOs, making it the last U.S. state to accept NBOME’s COMLEX-USA for physician licensure.

I believe the NBOME Founders would be very proud of what they started, and recognize the many longstanding contributions made by the NBOME Board, the National Faculty, and our amazing staff. They would be pleased with the positive impact NBOME assessments have on teaching, learning, and professional identity, as well as their importance to the quality of care when patients interact with what is now over 150,000 DOs and osteopathic medical students.

Stay tuned for more “history lessons” in the coming weeks and months as we continue our celebration of our momentous 85 years!


Contributed by John R. Gimpel, DO, MEd  |  President and CEO  |  NBOME

On May 20, 2019, the COMLEX-USA Level 1 examination began its 2019-2020 test cycle. This is the first Level 1 cycle to fully align with the enhanced COMLEX-USA Master Blueprint.

New COMSAE Phase 1 Forms

To prepare for this event, we released new COMSAE Phase 1 forms in January 2019 to provide students with a reliable self-assessment tool. The new forms allow students to experience COMLEX-USA style test items and practice with the examination timing.

For further preparation, we have also released new COMLEX-USA Level 1 Practice Exams available here.

Interpreting COMSAE Phase 1 Scores

Following the implementation of the new blueprint, and administration of the enhanced COMLEX-USA Level 1 examination, a new passing standard will be set and applied to COMLEX-USA Level 1 administrations that began in May 2019. After the first several score releases of COMLEX-USA Level 1 beginning mid-July 2019, NBOME will conduct a comprehensive evaluation of scores on COMSAE Phase 1 and subsequent scores on Level 1. Following this evaluation, we will provide additional information to enhance interpretability of COMSAE Phase 1 scores and Level 1 scores.

Score Release Dates

Score release for candidates taking COMLEX-USA Level 1 from May 20-June 14, 2019, is anticipated to take longer than usual to allow sufficient time for the standard setting process to set a minimum passing score and statistically validate candidate performance for the new Level 1 testing cycle. Thereafter, we anticipate a normal reporting time for Level 1 score releases, subject to other delays contingent upon candidate scheduling, standard setting or other variables. Scores for tests taken May 20-June 14, 2019, are scheduled to be released July 16-18, 2019. Further score release dates can be found here.

Sample Score Report

We’ve posted a sample score report, which you can find in your NBOME account under “Student Scores”. Candidates and Schools can use this sample report to become familiar with the new format of the COMLEX-USA Level 1 candidate score report with the implementation of the Master Blueprint. This report will be implemented for all Level 1 administrations beginning May 20, 2019. The previous candidate score report will remain available for Level 1 administrations prior to May 1, 2019.

Update on the 2020-2021 Test Cycle

To accommodate an earlier initial score release in late June 2020, the 2020-2021 COMLEX-USA Level 1 test cycle will launch on May 4, 2020. This will allow sufficient time for statistical validation on items and candidate performance. Consequently, there will likely be limited COMLEX-USA Level 1 testing dates in April 2020.


We are pleased to share the latest edition of the NBOME Annual Report.

As we mark our 85th anniversary, we are reminded of the many contributions and collective efforts made by our staff, Board, National Faculty, volunteers, clients and partners. Without your support and dedication, 2018 would not have been as successful as it was for the NBOME. Thank you for helping to make this another historic year and for furthering our mission to protect the public! Please take a moment to review the report and read about our shared achievements.

A supplement to the 2018 Annual Report is also available on our website that includes National Faculty contributions. If you would like us to send a copy to someone in your network, please let us know.

Launched  |  COMLEX-USA  |  New Level 2-PE Exam

As part of our transition of the COMLEX-USA examination series to a contemporary, two-decision point, competency-based blueprint and evidence-based design model, we are pleased to announce the launch of the new COMLEX-USA Level 2-PE examination in March 2019. This is the second exam to align with the enhanced COMLEX-USA Master Blueprint, after the Level 3 release in September 2018.

Click here for more information on COMLEX-USA Level 2-PE.


Coming Soon  |  COMLEX-USA  |  New Level 1 and Level 2-CE Exams

The roll-out of the enhanced Master Blueprint continues through 2019, with a planned update to Level 1 launching in May and Level 2-CE in June. To accompany these updates, we’ve added new content to our website related to COMLEX-USA Level 1, Level 2-CE and Level 2-PE to align with the Master Blueprint and test specifications. This enhanced content also includes new practice exams for Level 1 and Level 2-CE. Current versions of the COMLEX-USA content and practice exams for Level 1 and Level 2-CE will be retired in May and June, respectively.

The launch of the enhanced COMLEX-USA is the culmination of nearly 10 years of work and evidence-based design by experts and leaders from across the organization and the country who contributed in all areas to the creation and deployment of this state of the art assessment.

Launched  |  COMAT-FBS  |  Comprehensive

In March 2019, the NBOME officially rolled out COMAT Foundational Biomedical Sciences (FBS) Comprehensive, following a successful pilot program that began in December 2018. The comprehensive examination covers 10 body systems and 6 disciplines, all in one 250-item, 5-hour exam.

Designed to assess first- and second-year osteopathic medical student (OMS I, OMS II) progress in basic biomedical sciences, COMAT FBS Comprehensive provides summative assessment of osteopathic medical student knowledge, skills and abilities in the foundational biomedical sciences, and helps students prepare for COMLEX-USA Level 1. The FBS Comprehensive examination may be used for end-of-course assessment for students enrolled at a college of osteopathic medicine (COM). Individual COMs may also administer the examination at other times in accordance with their curriculum goals and mission. This examination emphasizes core knowledge and elements of osteopathic principles and practice in the foundational biomedical sciences disciplines that are essential for the pre-doctoral osteopathic medical student.

Click here for more information on COMAT FBS Comprehensive.

Coming Soon  |  COMAT FBS  |  Targeted

The COMAT Foundational Biomedical Sciences (FBS) Targeted examinations will provide a means to assess student knowledge and skills related to a particular discipline or organ system throughout the first two years of medical education. This targeted approach will enable schools to customize their portfolio of COMAT examinations to best meet the evolving assessment needs and requirements as they pertain to osteopathic medical student knowledge, skills and abilities.

Each 90-minute COMAT FBS Targeted exam will consist of 62 items with content specific to one of 8 body systems and 6 disciplines.

More information on COMAT FBS Targeted will be available soon.

Results from the National Resident Matching Program (NRMP) Main Residency Match, combined with the AOA Match, the Military Match, and sub-specialty match programs, show that 98% of all 2019 graduates from the nation’s DO-granting medical schools matched into graduate medical education (GME) residency programs, according to the American Association of Colleges of Osteopathic Medicine.  While final numbers are expected to be available later this spring, the overall success of DO students and residents is spectacular news for the nation’s colleges of osteopathic medicine and related stakeholders.

Despite this historic success, the way in which DO and MD medical students pursue residency training positions in the United States is under fire. And the complex, multi-layered system that connects MD and DO medical students to residency programs appears to be a far cry from eHarmony.

Challenges and Potential Opportunities with the Current Match System

Often people refer to the NRMP as “The Match,” but in 2019 there still exist other mechanisms by which applicants may pursue graduate medical education (GME) positions, including the AOA Match and Military Match. Starting in 2020, however, programs that formerly used the AOA Match and have achieved ACGME Accreditation will now use the NRMP Match as part of the Single Accreditation System. Residency applicants will await results from the NRMP Match that culminates on a single day with the simultaneous announcement of binding three- to five-year training/job commitments and program specialties for some 40,000 students and doctors.

Both residency programs and applicants face stressful complications from the match process as it exists today. It’s common for MD and DO applicants to apply to 50-150+ residency programs because of the perceived competition to get into a residency program and the ease of electronic applications. After candidates apply, programs extend interview invitations in September through November, sometimes requiring responses within an hour by applicants for them to avoid losing the opportunity. Students spend considerable time and money traveling to interviews, and sometimes are pressured to declare their intent to rank a specific program #1 on their list prior to the program director submitting their rank list to NRMP. To sort through an increasing number of applicants, program directors resort to COMLEX-USA and USMLE exam scores to filter through applicants, despite knowing that neither licensing exam was designed with that as its primary purpose.

The American Medical Association (AMA) and other stakeholders are demanding changes to the system. Numerous suggestions have been put forward, from traffic rules for granting interviews to pre-signaling of program preferences by applicants. In response, the AMA, the Federation of State Medical Boards (FSMB), the National Board of Medical Examiners (NBME), and others (including the NBOME) met to discuss the pros and cons of possible changes to licensure exam scoring and reporting. Conversations explored the possible elimination of numerical scores for licensing exams, allowing candidates to elect pass-fail-only score reporting, holding back numeric scores until after a successful match, and creating new composite scoring to include clinical skills examination data. Dialogue also continued around the equal treatment of COMLEX-USA and USMLE by ACGME residency program directors, an important clarification given that uncertainty on this point has been contributing to heightened stress among DO students.

The discussions and debates are likely to heat up across the house of medicine, and the NBOME, as usual, will be actively listening and participating. The argument not to continue the status quo is supported by increasing evidence of anxiety, depression and other mental health issues among MD and DO medical students, which clearly translates to a patient safety issue. As we delve deeper into this topic, we will endeavor to remain true to our mission of supporting assessment features that protect the public and patients, who are ultimately served by those who rely on our products and services, both for the primary and important secondary purposes of the assessments.

Here’s a quick recap of what happened during this year’s match programs:

The 2019 AOA Match

19% of graduating senior students and several hundred prior graduates participated in the American Osteopathic Association (AOA) residency match this year, and of those, 54% matched to a program, with more than 500 placed into primary care specialties. The top 5 specialties in the AOA Match this year included Family Medicine, with 34% of placements, followed by Internal Medicine at 22%, and Orthopedic Surgery at 12%. Many of the remaining 81% of graduating seniors went on to participate in the NRMP match the following month.

The 2019 NRMP Main Residency Match

NRMP’s Match Day saw a total of 38,367 applicants submitting program choices for 35,185 residency slots, up from 37,103 applicants and 33,167 slots in 2018 (increases of 1,264 and 2,018, respectively). This year, a record 6,001 osteopathic medical students and graduates submitted NRMP program choices, and 84.6 percent of those matched to PGY-1 positions, an all-time high. The number of osteopathic medical students participating in the NRMP match has risen by 3,052 candidates, a 103 percent increase since 2015.

The 2019 NRMP Specialty and Fellowship Match

We also saw impressive results from our osteopathic graduates in the NRMP’s specialty match this year. Osteopathic residents enjoyed a 78.9% match rate, a 2% increase over 2018. And congratulations to the 1000+ new DOs who matched to ACGME Fellowship programs, more than in any prior year. The ACGME further clarified in its updated Common Program Requirements that COMLEX-USA and USMLE are both acceptable for Fellowship Programs.

Congratulations to the nation’s DO and MD students and residents who were successful in navigating the matching processes in 2019, and to the schools and residency programs that helped facilitate successful “matches” as students become residents and some residents move on to fellowship training.



Contributed by John R. Gimpel, DO, MEd | President and CEO | NBOME

This year’s ACGME Annual Education Conference featured several hundred presentations, sessions and networking opportunities for more than 4,000 educators and learners in the graduate medical education community. The NBOME was delighted to participate as a key sponsor again this year, hosting an exhibit booth that welcomed hundreds of visitors. In addition, NBOME hosted a pre-conference workshop and panel discussion entitled “Using COMLEX-USA in Graduate Medical Education”, and members of the NBOME team took part in numerous meetings and sessions including those hosted by the Assembly of Osteopathic Graduate Medical Educators and numerous ACGME Residency Review committees.

The overarching theme of the conference, Rediscovering Meaning in Medicine, was reflective and asked those in attendance to take a deeper dive into the joys and challenges in training the nation’s physician workforce: looking at doctors as people, giving care and caring, hope and healing to themselves, each other and their patients.

As a non-physician who has had the pleasure of working with many talented and inspiring (and yes, sometimes depleted) physicians and physician leaders in academic, clinical practice and assessment settings, I found it enlightening. Having witnessed both physician burnout and dis-wellness, I find that the words “Physician, heal thyself” resonate. As in an airplane when instructed to place the mask over your face before assisting others, the concept that physician educators need to take care of themselves is clear. The importance of the teaching and learning environment where new physicians enter and learn how to be doctors was brought forth in numerous ways throughout the course of the conference. The words of Vivek Murthy, MD, U.S. Surgeon General, were heartfelt as he relayed his experience with feeling a fear of failure, and transforming that fear to courage.

Having the opportunity to work with osteopathic medical leaders, I was struck by the opportunities for improving the clinical learning environment for trainees and their mentors, and by the strengths I’ve seen in the osteopathic medical profession. I have been to hundreds of meetings with doctors of osteopathic medicine (DOs) and in virtually every meeting, these doctors greet each other warmly, often hugging one another. Their warmth and expression convey meaning to one another: you are not alone, you matter to me, we are on the same team.

As the single accreditation system for graduate medical education comes to fruition in the next year and matures, the integration of newly minted DOs entering the larger house of medicine may bring a greater sense of human-ness and warmth to all physician learners, teachers and members of the inter-professional care team, and to the patients they serve. With osteopathic medical students now accounting for one in four medical students and later joining a unified system for residency training, can this expression of caring and support promote courage and help create the kind of clinical learning environment that encompasses us all?



Contributed by Melissa Turner, MS  |  Associate Vice President for Strategy & Quality Initiatives  |  NBOME

2019 US Osteopathic Medical Regulatory Summit

The 3rd United States Osteopathic Medical Regulatory Summit took place February 28 – March 1, following the Midyear Meeting of the Board of Trustees of the American Osteopathic Association (AOA) in Naples, Florida. This unique event provided a forum for attendees to explore and define the distinctive elements of osteopathic medical regulation as reflected in osteopathic undergraduate and graduate medical education and accreditation, assessment and licensure, as well as board certification and continuing medical education.

Sponsored by the NBOME, the AOA, and the American Association of Colleges of Osteopathic Medicine (AACOM) with support from the Osteopathic Heritage Foundations (OHF) and the Osteopathic Founders Foundation (OFF), the two-day event included osteopathic medical students, residents, and other key stakeholders in osteopathic medical regulation. During the summit, attendees reviewed the evidence for the distinctive practice of osteopathic medicine, and the essential elements necessary to ensure high quality, osteopathically distinctive care for our patients in the rapidly changing healthcare arena.

Participants Included Representatives from:

  • The Assembly of Osteopathic Graduate Medical Educators
  • American Association of Osteopathic Examiners
  • Accreditation Council for Graduate Medical Education
  • AOA Bureau of Osteopathic Specialists
  • AOA Bureau of Emerging Leaders
  • AOA Commission on Osteopathic College Accreditation
  • Federation of State Medical Boards
  • Osteopathic Heritage Foundations
  • Osteopathic Founders Foundation
  • Student Osteopathic Medical Association
  • Council of Osteopathic Student Government Presidents

2019 ACGME Annual Education Conference

The 2019 ACGME Annual Education Conference held in early March featured several hundred presentations, sessions and networking opportunities for more than 4,000 educators and learners in the graduate medical education community. The NBOME exhibit booth welcomed hundreds of residency program directors and coordinators, while members of our board and Executive Leadership Team hosted a number of information sessions throughout the conference.

Drs. John Gimpel and Michael Finley hosted a presentation entitled “COMLEX-USA Use by Program Directors as Part of a Comprehensive Assessment System”. The session highlighted areas of harmonization between the COMLEX-USA completed by all osteopathic medical students and DO residents, and the structure of a high performing resident assessment system.

Meanwhile, NBOME Board Secretary-Treasurer, Richard J. LaBaere II, DO, MPH took part in a pre-conference course on The Basics of Institutional Accreditation, and another info session on Improving Graduate Medical Education.

These and other NBOME representatives participated in numerous additional meetings, including those hosted by the Assembly of Osteopathic Graduate Medical Educators and numerous ACGME Residency Review Committees. For more on ACGME 2019, see our Reflections on Rediscovering Meaning in Medicine article here.

2019 AACOM Annual Conference: Educating Leaders

This April, the NBOME joined a diverse group of attendees, including university and college leadership, government officials, researchers, undergraduate and graduate medical educators, at this year’s AACOM Annual Conference. We explored the topic of big data with friends and colleagues as we discussed the foundations of evidence-based decision making.  NBOME had the opportunity to share update presentations with the AACOM Board of Deans, the Council of Osteopathic Student Government Presidents, and the Educational Council on Osteopathic Principles.

In advance of the conference, NBOME hosted a special workshop and panel discussion focused on the use of national assessment data as part of a comprehensive strategy for demonstrating educational outcomes and continuous quality improvement. Members of the NBOME team (pictured above) also had the opportunity to present NBOME and COMLEX-USA updates to a packed house, during our sponsored luncheon event.

For more insights on the topic of big data and its impact on osteopathic medical licensure, check out our pre-conference article content here.

Coming Up

In the next quarter we’ll be making appearances at the following conferences and meetings:

At the turn of the 20th century, the practice of medicine was rudimentary to say the least.  Patients were medicated with such dubious treatments as castor oil, whiskey, arsenic — and of course bloodletting by way of leeches.  Unsanitary surgeries resulted in few cures — and many deaths.

It was at this time that Andrew Taylor Still, MD pioneered a radical new form of medicine. He believed that traditional medicine of the time treated symptoms rather than the underlying cause of disease itself, and that the object of the physician was to take a holistic approach to understanding the patient, body-mind-spirit, and to find and promote health. He postulated that “rational medical therapy” ought to consist of those therapies that promote health, including osteopathic manipulative treatment of the musculoskeletal system and surgery, and that the drugs available at the time should be used only sparingly. In 1892 he founded The American School of Osteopathy, in Kirksville, Missouri, and began teaching osteopathic medicine according to his methods and philosophy.

If A. T. Still were alive today he would see that many of his then “radical” ideas have been adopted by mainstream medicine. Any good doctor today, regardless of their suffix, will likely ask questions about body, mind and spirit: “How is your diet?”, “How are things at home?”, “Are you feeling stressed at work?”, “What are your concerns about your symptoms?” Dr. Still would also be elated at the growth of the profession. The fact that we’re now approaching 150,000 DOs and osteopathic medical students in the United States, and that one in four US students starting medical school today has chosen to enter an osteopathic medical school, would be very satisfying. He’d also be very pleased to see osteopathic medicine spreading across the globe and take great pride in the legacy he left behind, and the impact his philosophy has had on modern medicine and healthcare.

Unfortunately such widespread acceptance can present the risk of forgetting one’s roots.  Dr. Still may be concerned that a number of DO students opt not to practice or develop their OMT skills in clinical training and residency, and are unable to offer this treatment to patients. He might wonder why more resources have not been dedicated to outcomes research. Most DOs are very patient-oriented by nature, and focus on care more than designing research protocols and crunching numbers. On the other hand, pharma is able to offer massive funding to study pharmacological interventions. There has been some research on our end, but more needs to be studied on OMT outcomes versus other methods, showing which patients are healthier post-treatment, which patients are getting back to work faster, and which patients are more satisfied with their recovery.

We’re extremely proud to continue the legacy of A. T. Still, and make conscious efforts to remember where we came from.



Based on an interview with John R. Gimpel, DO, MEd  |  President and CEO  |  NBOME


For obvious reasons, all of us at the NBOME are pretty big fans of DOs. In honor of National Osteopathic Medicine (NOM) and International Osteopathic Healthcare (IOH) Week, we’re counting down 4 things we love about DOs.

Whole person care

Osteopathic medicine is based on a different philosophy. A DO sees the big picture. They believe in the body’s tendency towards self-healing, and seek to nurture that above all. They have a keen sense of the interconnectedness of the entire body’s parts and systems. Yes, they’ll still prescribe you medicine if you need it, though they may point out that it’s not the medicine that heals; rather, the medicine gives your body the tools it needs to heal itself. They’re also known to find holistic workarounds to problems that may be better treated without medicine. If you come to see them for a headache, they might prescribe you some stretches. You’d also be shocked at the wonders that can be worked with a little bit of yoga.

Osteopathic Manipulative Treatment (OMT)

DOs learn the same science and techniques as MDs. However, they also learn a very special skill set that, while any doctor is welcome to learn and practice it, not all doctors are required to know. DO training contains a very specialized focus on the musculoskeletal system, the system of bones, muscles, ligaments and connective tissue that literally holds us all together. With this knowledge they are able to perform Osteopathic Manipulative Treatment (OMT), allowing them to diagnose and often alleviate certain health problems simply using their hands. It’s not just the magic DO touch, it’s OMT.

The DO Credentials

The letters “DO” at the end of a doctor’s name say a lot of things. First of all, this person is fully licensed to practice medicine anywhere in the United States (and 44 countries abroad). They also say that this person has practiced OMT, and most importantly, that this doctor is committed to whole-person care of mind, body and spirit and the body’s natural tendency towards self-healing. The NBOME plays a big role in granting these credentials, so perhaps we are a bit biased. But it is nice to look at the DO suffix on a doctor’s name and know you’ll be in good hands.

They’re Huggers

To our knowledge, no formal study has ever been conducted on the matter. Anecdotally, however, DOs have a reputation as huggers. It’s hard to quantify, but there’s a certain intangible warmth to our DOs that often manifests as a hug. If you are inclined, NOM Week may be a good opportunity to conduct independent research into this phenomenon by offering your DO a friendly hug.

Today kicks off an important week for the osteopathic community: National Osteopathic Medicine (NOM) and International Osteopathic Healthcare (IOH) Week. It’s a week to let your DO flag fly, to celebrate your osteopathically distinctive pride, and spread the word about DOs and their approach to medicine far and wide.

At the NBOME we have a number of events planned to help get the word out. From yoga sessions and a company-wide game of osteopathic Family Feud to free blood pressure screenings for our office building and community neighbors, it’s going to be a jam-packed week.

Here are some ways you can engage those around you during NOM and IOH Week:

  1. Educate your patients. And help them more fully understand the osteopathic approach to care.
  2. Enhance your elevator speech. Are you able to easily articulate what osteopathic medicine is and what makes it distinctive? Honing your story can help you communicate more clearly with not just your patients, but everyone in your life, from your friends, to your family, to your Uber driver.
  3. Show Your DO Pride on Social Media by participating in the AOA’s DO Pride Photo Contest.

Stumped about crafting your elevator speech? We’ll help you out a little bit with that one tomorrow. In fact, check back all this week for more articles about DOs and how proud we are to play our own role in this unique profession.

Happy NOM and IOH Week from all of us at the NBOME!

A little bit of art. And a whole lot of science. Both elements have been woven deep into the DNA of the NBOME and our portfolio of assessments since our start 85 years ago. The study of evidence, validity, reliability and defensibility (the science) is met with the art of assessment.

Of course it’s not the volume of data, but what organizations do with it that truly matters.

Hundreds of thousands of data elements are shared securely between the NBOME and UME and GME institutions. Routinely, the data we collect is leveraged internally to enhance the precision of our test development processes, our quality assurance standards, and our commitment to high standards in assessment.

We are also acutely aware of how our data is utilized by those around us — specifically when it comes to the path to osteopathic medical licensure. In addition to using data for student and resident assessment and promotion, national standardized assessment data related to COMLEX-USA, COMSAE, and COMAT is frequently used to enable continuous improvement in curricular approaches and clinical learning environments.

With Big Data comes Big Responsibility. 

To promote responsible use of COMLEX-USA scores, the NBOME recommends those who use this data develop a strong understanding of what the examinations measure, how standards are set, what the scores mean, and how they may correlate to performance in residency and practice.

With heightened awareness of the implications big data has in our universe, we are tremendously proud of our role in helping to ensure licensed DOs are qualified to care for patients by passing valid, evidence-based assessments designed specifically for osteopathic medical practice — all of which is backed up by extensive (big) data-driven decision-making.








Connecting with osteopathic medical students as well as leadership and faculty at colleges of osteopathic medicine is very important to us.

These visits enable us to provide up-to-date information on the latest enhancements to COMLEX-USA, facilitate item writing workshops and other faculty development programs, answer questions, and most importantly, gain valuable feedback as we continue our work to improve the assessment experience for both students and COM faculty.

Here’s where we’ve been so far this year…




On April 5th,  Dr. Finley visited Western University of Health Sciences’ College of Osteopathic Medicine of the Pacific, where he met with students, faculty, and their career development team.





On March 26th, Dr. Finley visited Michigan State University College of Osteopathic Medicine in East Lansing, Michigan, to provide NBOME and COMLEX-USA updates to both faculty and students.





On February 22nd, Dr. Gimpel visited Nova Southeastern University Dr. Kiran C. Patel College of Osteopathic Medicine in Fort Lauderdale, Florida, to provide NBOME and COMLEX-USA updates to both faculty and students.





On February 19th, Dr. Gross visited Georgia Campus – Philadelphia College of Osteopathic Medicine in Suwanee, Georgia, to provide COMLEX-USA Level 2-PE updates to both faculty and students.




On February 15th, Dr. Gimpel visited Campbell University School of Osteopathic Medicine in Lillington, North Carolina, to provide NBOME and COMLEX-USA updates to both faculty and students.





On February 7th, Dr. Gimpel visited Ohio University Heritage College of Osteopathic Medicine in Dublin, Ohio, to provide NBOME and COMLEX-USA updates to both faculty and students.





On February 1st, Dr. Gimpel visited West Virginia School of Osteopathic Medicine in Lewisburg, West Virginia, to provide NBOME and COMLEX-USA updates to both faculty and students.





Reference Materials

Fundamental Osteopathic Medical Competency Domains


COMLEX-USA for Residency Program Directors


COMLEX-USA Faculty Review Program

Upcoming Changes

COMLEX-USA Master Blueprint Effective 2018


Client Registration System Tutorial

Practice Exams

Practice Exam – New COMLEX-USA Level 1

As National Physicians Week comes to a close on Saturday, we’ve been giving a little extra thought to the incredible doctors in our lives. Their contributions, dedication and sacrifice are overwhelming, especially as we consider the origins of National Physicians Week and its focus on mental health and wellness.

We at the NBOME are so very appreciative of the expertise and insight the physicians on our staff bring to our organization and are thrilled to mark National Physicians Week and National Doctors Day by delivering red carnations to the many doctors in our midst.


We hope you will also take a moment to recognize the physicians in YOUR life, whether they’re your friends, your colleagues, or your own care providers. Each, in their own way, is responsible for our health and wellness – and this is an opportunity to return the favor. Check in and ensure THEY are equally healthy and well — say thanks, offer a friendly hug (DOs especially love hugs), and share a few words of love and support for all they do!


Want to read more?  Check out the AOA’s 10 stories from 10 inspiring DOs.

In this section:


COMLEX-USA Faculty Review Program

The NBOME provides approved faculty members of colleges of osteopathic medicine the opportunity to conduct a COMLEX-USA review, which enables them to experience a computer-based examination similar to what candidates experience.

We currently offer two computer-based examinations for faculty to review: Faculty Review Form for COMLEX-USA Level 1 and Faculty Review Form for COMLEX-USA Level 2 Cognitive Evaluation (Level 2-CE).

Each examination has 200 items. To preserve the integrity of the examination, while offering the most realistic simulation of the COMLEX-USA possible, the examinations will consist of questions that have been recently retired. Please note, the content and format of items are identical to what is delivered to candidates.


COMLEX-USA Faculty Review examinations are available six days a week (every day except Sunday), year-round, at Prometric testing centers. Faculty members must register for and schedule the examinations online, and will be allowed up to two hours to complete the review. Although faculty reviewers can select answers for each item, their answers will not be scored and they will not receive score reports.

At the end of the review, the reviewers are provided with a survey. The NBOME will review the comments but cannot respond to individual reviewers. If you are a faculty member of an accredited college of osteopathic medicine and are interested in participating in a review examination, please contact the dean of your school.


To register faculty members for the COMLEX-USA Faculty Review examination, deans or their designees will need to log in to their school’s dean’s page and submit their information.

For more information on the Faculty Review Program for COMLEX-USA, contact Client Services at 866.479.6828.

In this section:

COMLEX-USA · Level 1

  • Eligibility
  • Blueprint
  • Examination Format
  • Registration & Scheduling
  • Examination Dates
  • What To Expect on Exam Day
  • Test Accommodations
  • Practice & Preparation
  • Scoring & Reporting

  • In this section:


    As we mark the end of Black History Month 2019, we are reminded of the many men and women of color who, with a passion for people and community, contribute their talents to the practice and evolution of osteopathic medicine. Among the many influential voices and thought leaders in our field, the name William G. Anderson, DO, is certainly a standout. We would like to take this opportunity to recognize his advocacy of the profession and his role in the Civil Rights Movement.

    Dr. Anderson, a professor of surgery and senior advisor to the dean at the Michigan State University College of Osteopathic Medicine (MSU-COM), holds the distinction of being the first African American on the American Osteopathic Association Board of Trustees and served as the president of the American Osteopathic Association in 1994 and 1995. He also served as associate dean of the Kirksville College of Osteopathic Medicine and as clinical professor of osteopathic surgery at MSU-COM. Dr. Anderson was an active member of the NBOME Board of Directors from 2003 through 2014 and was a member of its Executive Committee from 2007 to 2010.

    Born in Americus, Georgia, in 1927, Dr. Anderson attended Des Moines University College of Osteopathic Medicine. In Albany, Georgia, he was prevented from treating patients because of segregationist policies in 1957. In response, he founded and became the first president of the Albany Movement, which worked to register African American voters and devised ways to end racial segregation. With this achievement, Dr. Anderson became a pioneer in the Civil Rights Movement and was a personal friend and colleague of Dr. Martin Luther King, Jr. A member of the Physicians for Social Responsibility, he is a frequent speaker on osteopathic medicine and civil rights topics. In 2014, the MSU-COM’s award-winning civil rights lecture series was renamed as the “Dr. William G. Anderson Lecture Series: Slavery to Freedom,” which continues through this year – sadly, the final year of the series.


    First- and second-year osteopathic medical students were recently given an overview of COMLEX-USA Level 2-PE (performance evaluation) testing from Gretta A. Gross, DO ’97, MEd, Vice President for Clinical Skills Testing for NBOME. Dr. Gross, a graduate of Philadelphia College of Osteopathic Medicine (PCOM), discussed the mission of the NBOME and ways to prepare for this exam, which assesses the fundamental clinical skills necessary to enter into supervised graduate medical education.

    The NBOME, founded in 1934, is an independent, nongovernmental, not-for-profit organization with the mission of protecting the public by assessing competencies for osteopathic medicine and related healthcare professions. The COMLEX-USA series is the primary pathway to licensure for physicians seeking to practice osteopathic medicine and surgery.

    Dr. Gross explained that the performance evaluation is a standardized patient-based assessment of fundamental clinical skills essential for osteopathic patient care, while the COMLEX-USA Level 2-CE (cognitive evaluation) examination is a computer-based application of osteopathic medical knowledge concepts related to clinical sciences, patient presentations and physician tasks.

    To take the Level 2-PE and 2-CE exams, students must have completed their second year at an accredited college of osteopathic medicine, must have passed the COMLEX-USA Level 1 exam following their second year of medical school, and must be in good academic and professional standing at their school.

    According to Dr. Gross, the performance evaluation occurs at two NBOME testing centers: one in Conshohocken, Pennsylvania, a suburb of Philadelphia, and the other in Chicago, Illinois. The test takes place during a six-hour period and includes 12 standardized patient-based cases, allowing 14 minutes for each patient encounter plus nine minutes to document findings in an eSOAP note, which follows the Subjective, Objective, Assessment, Plan (SOAP) format. In addition, she said, 15-minute breaks take place after every four patient encounters, stretching the time at the testing center to seven hours.

    Dr. Gross explained that the exam, which tests whether students can demonstrate competency in the fundamental clinical skills and related competencies, is graded in two domains: the humanistic domain, which tests physician/patient communication and interpersonal skills as well as professionalism, and the biomedical/biomechanical domain, which tests medical history taking and physical exam skills, documentation skills, and osteopathic manipulative treatment. The exam, scored by 30 individuals, is “not designed to provide feedback,” she said, as results are provided solely as pass/fail and are generally reported 8-10 weeks after the examination has been taken.

    The most common ways students prepare for the test, she explained, are through clinical rotations, standardized patient encounters, books and courses on physical diagnosis, as well as a COMLEX-USA Level 2-PE prep course. But the basics of preparation include reviewing the NBOME website, reading the Level 2-PE orientation guide, watching the NBOME video and practicing with eSOAP notes.

    Dr. Gross explained that the pass rate for the exam is historically between 92 and 93 percent. She said that students usually prefer to take the exam between the spring of their third year and the summer of their fourth year, although the exam is offered year-round. She advised students to consider scheduling the exam, which costs $1,295, as soon as they are eligible, because seats are released on a rolling basis one year in advance.

    About PCOM Georgia

    PCOM Georgia is a private, not-for-profit branch campus of the fully accredited Philadelphia College of Osteopathic Medicine, a multi-program institution of educational excellence founded in 1899. PCOM Georgia offers the doctor of osteopathic medicine degree, the doctor of pharmacy degree, the doctor of physical therapy degree, as well as graduate degrees in biomedical sciences and physician assistant studies. The campus, located in Suwanee, Georgia, is also home to the Georgia Osteopathic Care Center, an osteopathic manipulative medicine clinic, which is open to the public by appointment. For more information, visit www.pcom.edu or call 678-225-7500.



    This article originally appeared on PCOM’s website.


    The latest in our monthly “Day in the Life” series took us into the day-to-day of NBOME’s Standardized Patients (SPs). COMLEX-USA Level 2-PE candidates from all over the country travel to our facilities in Philadelphia and Chicago to meet with these folks as they simulate real patients in order to assess the candidates’ performance in examination rooms. SP Trainer Assistants Joanne Cunningham (Philadelphia) and Candace Dickerson (Chicago) took attendees through what it takes to be an SP.

    The concept of a standardized patient was introduced in the 1960s by Howard S. Barrows, M.D., with the idea to simulate a real-life patient in a manner so realistic and consistent that encounters with the SP can provide a fair standard for assessment of aspiring doctors. SPs must respond consistently to each candidate. A candidate asking the same questions and performing the same maneuvers should learn the same information, regardless of whether he or she sees an SP in Chicago or Philadelphia. Keeping the simulation standardized allows every candidate an equal opportunity to demonstrate his or her skills in key clinical and interpersonal areas.

    Where Do SPs Come From?

    Almost anyone can be an SP. In fact, that’s what we aim for. Doctors treat patients of a diverse range of ages, ethnicities, and backgrounds, so the NBOME works to ensure a similar range in its SP pools. We look for reliability, recall skills, and the ability to follow directions and think and act within the context of a case, as well as fairness and confidentiality.

    Basic Training

    SPs are highly trained, so even a skilled clinician might be hard-pressed to tell the difference between a real patient and an SP. The extensive training process involves several days of training in both a particular case and in communication assessment, followed by another several days of “pretest” practice and quality assurance, and can take anywhere from a few weeks to a couple of months. An SP’s performance and scoring are then reviewed by a team of physicians, psychometricians, and other experts who need to sign off before the SP can go “live” in the exam. Training continues even after this and lasts throughout an SP’s career at the NBOME, with physicians and trainers regularly monitoring SP performance, and SPs regularly monitoring their own performance as well as that of any counterparts on their case.

    Day In The Life

    The COMLEX-USA Level 2-PE is offered most mornings, along with several afternoons and some Saturdays each month. A minimum of 14 SPs are scheduled for each exam session: 12 who see candidates as part of the exam and 2 who are on “standby” in case of callouts or other changes. SPs arrive at 8 AM for a morning exam and 2 PM for an afternoon exam, and work until about 3:30 PM or 9:30 PM, respectively. They sign out their case, pick their lunch (a notable perk of the SP lifestyle), and set up their examination room to its standard configuration. They review their case and report any physical findings (e.g., a bruise, a runny nose, or a cough) to the Trainer on Duty (TOD). Depending on the case, such findings could interfere with an SP’s portrayal and lead candidates to incorrect conclusions. Accordingly, SPs are required to report them to the TOD, who, with a staff physician’s assistance when needed, will either determine that the SP needs to be replaced with one of the “standby” SPs or, if the finding won’t add confusion, provide the SP with a response to use if asked about it. SPs then have time before the exam begins, which they may spend meeting with their trainers, undergoing a medical screening with a staff physician, watching encounter videos, or studying their case.

    At 9 AM/3 PM, SPs meet for announcements from the TOD, receive blueprints for their day’s encounters, and sign in to the computers in their exam rooms. SPs report to their rooms by 9:20 AM/3:20 PM, and exams begin promptly at 9:30 AM/3:30 PM (barring delays).

    For each session, SPs see up to 12 candidates in 12 encounters of 14 minutes each. Candidates take patient histories, perform physical examinations, and sometimes perform osteopathic manipulative treatment (OMT)*. After an encounter is finished, candidates have 9 minutes to write a patient note, during which time the SPs score the encounter, reset their rooms, and prepare themselves for the next encounter. Everyone prepares differently: between encounters you’ll see SPs walking around, reading, playing games, doing jumping jacks, and even balancing their checkbooks, whatever it takes for them to put aside the last encounter and treat the next one like it’s their first. SPs who are on standby also experience 12 encounters in a day, but for them it’s on video: they watch and rescore all encounters from one of their past days in the exam or, if applicable, one of their counterparts’ days.

    Students need to demonstrate competent clinical and communication skills before they can progress to residency, yet it’s impossible to assess these skills through traditional question and answer testing. That’s what makes COMLEX-USA Level 2-PE, and our SPs, so special. Through the work of our dedicated SPs, we’re able to test this critical component of medicine, protecting the public in the process.

    If you or someone you know might be interested in working as an SP at the NBOME, please direct them to our website: https://www.nbome.org/who-we-are/employment/standardized-patients/.


    *Candidates are not allowed to perform any invasive procedures.

    While we do our best every day to tell those we love and appreciate how much we value them, Valentine’s Day provides us a special platform to spread the love even further. This Valentine’s Day, the NBOME is sending some extra love to the members of our COMLEX-USA Resident Ambassador Program, who are a huge part of helping our organization grow and thrive within the osteopathic medical community.

    The mission of the NBOME holds a special place in our hearts — and in the hearts of our Resident Ambassadors, who work hard to represent us as we protect the public by providing the means to assess the competencies for osteopathic medicine.

    This elite group of resident physicians supports the profession by advocating for osteopathic qualification and the COMLEX-USA examination series.  They help NBOME expand the public’s knowledge and understanding of COMLEX-USA and encourage osteopathic distinctive assessment outreach in this era of the new single accreditation system for graduate medical education.  With this in mind, there has never been a more important time to have dedicated, passionate advocates like our Resident Ambassadors to help spread the word by uploading content, sharing information, dispelling myths about the COMLEX-USA exam, and much more.

    An ode to our Resident Ambassadors:
    ♥  We love you for advocating for our mission.
    ♥  We love you for publicly sharing your experiences related to the COMLEX-USA examination series.
    ♥  We love you for providing us a view into your challenges and successes as we work to improve the DO journey.
    ♥  We love you for educating others as we work collectively to advance the distinctiveness of the osteopathic medical profession.

    Ronak Mistry, DO; Jon Bardahl, DO; Carisa Champion, DO, JD, MPH; and Sarah Wolff, DO — Thank you for all you do!


    LEAD Conference

    On January 24-25, the NBOME attended the American Osteopathic Association (AOA) Leadership, Education, Advocacy & Development (LEAD) conference in Las Vegas, Nevada. The AOA LEAD conference delivers training to individuals throughout the osteopathic medical profession, with additional content delivered through focused tracks to provide advanced educational content. Among those representing the NBOME were John R. Gimpel, DO, MEd; Geraldine O’Shea, DO; and Sandra Waters, MEM, Vice President for Collaborative Assessments & Initiatives.

    During LEAD, Drs. Gimpel and O’Shea, and Ms. Waters, interacted with members of the osteopathic medical community, including other affiliate organizations, COMs, program directors, licensing boards, specialty boards and colleges, as well as members of NBOME’s own National Faculty.

    NBOME representatives actively participated in a variety of conference sessions focused on key topics including physician wellness, the impact of big data, changes in AOA board certification, updates to accreditation/billing and reporting, and the involvement of NBOME National Faculty. They also met with AOA regulatory affairs and advocacy to talk through top federal and state issues involving medical degrees and scope of practice, as well as physician access and affordability.

    Closing out the LEAD conference on Friday evening were Ron Burns, DO, AOA President-Elect, and Kate de Klerk, OMS IV, SOMA National President. At this session, student and young physician leaders spoke of the exciting AMA Resolution 955: Equality of COMLEX-USA and USMLE, and its importance to osteopathic distinction and professional identity formation for young DOs.

    AAOE Educational Summit

    Immediately following the LEAD conference, Drs. Gimpel and O’Shea, and Ms. Waters provided an NBOME and COMLEX-USA update to the American Association of Osteopathic Examiners. Input was solicited from the osteopathic medical licensing board community on a number of initiatives, and updates were provided on the enhanced COMLEX-USA Level 3, residency program director attestation, numeric and pass-fail scores for COMLEX-USA, and the AMA’s historic policy (AMA Resolution 955) supporting DO students, residents and osteopathic distinction by advocating for the equality of COMLEX-USA and USMLE.

    It was a year of dichotomy for women’s rights as numerous examples of sexism rose to the surface and were met with the dramatic emergence of the #MeToo and #TimesUp movements. As we look optimistically ahead to the future, we are acutely aware of the journey that has brought us here, as well as the strong women (and men) who have helped pave the way for what comes next.

    The NBOME is pleased to celebrate one of these trailblazing women on February 3rd, which marks the 4th annual National Women Physicians Day. It is a day we devote to celebrating female physicians, and it is also the birthday of Dr. Elizabeth Blackwell, a woman who changed the landscape for women in medicine by becoming the first American woman to hold a medical degree. Dr. Blackwell and others like her refused to accept the status quo of gender inequality, and National Women Physicians Day is a time we take to honor those courageous women who helped pave the way and improve the medical profession for those who came after them.

    While medicine has historically been a male-dominated profession, more recently we have seen a significant increase in women physicians and, more importantly, women leaders. According to the AOA, women make up 41% of the practicing DOs and osteopathic medical students in the US, several of whom are in our midst every day.

    We are honored to have such empowered and compelling role models in our midst, and would like to take this moment to thank them all for their contributions.