Experts vs. Automata: A Comparative Study of Methods for a Priori Prediction of MCQ Difficulty

Tracking #: 1976-3189

This paper is currently under review
Ghader Kurdi
Jared Leo
Nicolas Matentzoglu
Bijan Parsia
Uli Sattler
Sophie Forge
Gina Donato
Will Dowling

Responsible editor: 
Lora Aroyo

Submission type: 
Full Paper
Abstract: 
Successful exams require a balance of easy, medium, and difficult questions. Question difficulty is generally either estimated by an expert or determined after an exam is taken. The latter is useless for new questions, and the former is expensive; moreover, it is not known whether expert prediction is a good proxy for actual difficulty. In this paper, we compare two ontology-based measures for difficulty prediction with each other and with expert prediction (by 15 experts), evaluated against exam performance (of 12 residents) over a corpus of 231 medical case-based questions. We found one measure, relation strength indicativeness, to be comparable in performance (accuracy = 47%) to the experts (average accuracy = 49%).