Gender Differential Item Functioning on a National Field-specific Test: The Case of PhD Entrance Exam of TEFL in Iran

Document Type: Original Article


Shiraz University, Iran


Differential Item Functioning (DIF) exists when examinees of equal ability from different groups have different probabilities of answering a given item correctly. This study examined gender DIF in the PhD Entrance Exam of TEFL (PEET) in Iran, using both logistic regression (LR) and one-parameter item response theory (1-p IRT) models. The PEET is a national, centralized written examination designed to provide information on the eligibility of TEFL applicants to enter PhD programs. The 2013 administration of this test provided score data for a sample of 999 Iranian PhD applicants, consisting of 397 males and 602 females. First, the data were subjected to DIF analysis through an LR model. Then, to triangulate the findings, a 1-p IRT procedure was applied. The results indicated (1) more items flagged for DIF by LR than by 1-p IRT, (2) DIF cancellation (the number of DIF items was equal for males and females), as revealed through LR, (3) an equal number of uniform and non-uniform DIF items, as tracked via LR, and (4) female superiority in test performance, as revealed via the IRT analysis. Overall, the findings indicated that the PEET exhibits DIF. Test developers and policymakers (such as NOET and MSRT) are therefore recommended to take these findings into serious consideration and to exercise care in fair test practice by dedicating effort to less biased test development and decision making.
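For readers unfamiliar with the two procedures, the models behind them can be sketched as follows. The abstract does not give the exact specifications used in the study, so the formulation below is the commonly used one and is an assumption. The symbols are illustrative and do not come from the source: u_ij is the scored response of examinee j to item i, theta_j is the matching ability variable (often the total test score in LR), g_j is a gender indicator, and b_i is the item difficulty.

```latex
% Assumed LR DIF model for item i (standard formulation, not confirmed by the source)
\[
\ln\frac{P(u_{ij}=1)}{1-P(u_{ij}=1)}
  = \beta_{0i} + \beta_{1i}\,\theta_j + \beta_{2i}\,g_j + \beta_{3i}\,\theta_j g_j
\]
% Uniform DIF:     \beta_{2i} \neq 0 with \beta_{3i} = 0 (group effect constant across ability)
% Non-uniform DIF: \beta_{3i} \neq 0 (group effect varies with ability)

% Assumed 1-p IRT (Rasch) model; DIF appears as a difference in b_i between groups
\[
P(u_{ij}=1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1+\exp(\theta_j - b_i)},
\qquad
\Delta b_i = b_i^{\mathrm{female}} - b_i^{\mathrm{male}}
\]
```

Under this sketch, LR can detect both uniform and non-uniform DIF through the group and interaction terms, whereas the one-parameter model captures only a shift in item difficulty between groups. This may help explain why LR would flag more items than 1-p IRT.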